WRITING AND SOFTWARE BY IAN SHARPE
If you arrived looking for information on shuffling lists, lottery numbers and choosing random numbers that don't repeat, you may be interested in this article.
Someone who fits kitchens and bathrooms for a living asked for help with tiling: "Clients often want a random effect using different colours. You wouldn't believe how long we spend trying to get arrangements that look right. It takes a lot of time, and then the client comes home and says she can see a space invader near the cooker."
He had tried to produce random patterns with Excel so that the proposed tiling scheme could be approved by the client. Unfortunately, his efforts with the RAND function did not look very random due to clusters of tiles of the same colour:
I was asked whether these clumps are due to a bug in RAND, and if there is a way to randomly distribute a fixed stock of tiles of different colours without these shapes turning up.
Yes there is, but not by filling cells with RAND functions. RAND generates a number without reference to other cells. You can bias the result towards or away from certain numbers but this isn't the best solution to the problem.
When a client asks for a random pattern, what they say may not be what they have in mind.
Download the file Tiles.xls. Open it (with macros enabled if you're asked) and click the 'Truly random' button a few times. Watch the patterns. The chances are that parts of them will be as unordered as anyone could wish, but most of what you see has a degree of clumpiness – areas of the same colour that form shapes the brain picks out as non-random.
And yet the VBA program behind this button is producing genuinely random patterns. The way it works is to lay all available tiles in a line, thoroughly shuffle them into random order, then cut the line into strips that are laid out in rows to cover the area.
The fewer colours involved the more likely it is that a tile will fall next to another of the same colour purely by chance. The resulting clusters do not represent a bug. If there were no clumping in most of the patterns, the selection process would not be truly random. How often do the lottery numbers come out evenly spaced? Very rarely. There is normally a degree of bunching and this is exactly how a truly random system behaves.
Clumping of tile colours may be pleasing to some. Others will say it is not what they want. If you press the button enough times, the random shuffling will eventually generate every possible arrangement of tiles, including relatively un-clustered ones:
How many times? Lots and lots. The number of unique patterns you can make with four colours in a 12 x 10 grid is comparable to the number of atoms in our galaxy. That's a lot of mouse pressing and not a few lifetimes!
If you could examine all those possible patterns, in addition to all the random-looking ones, you would see every possible image that could be represented with that number of pixels in those colours. Every possible face, every possible fragment of writing in any possible language invented or not, next week's lottery numbers... anything. But since you'd have no way of picking out the relative few that appear to make sense, and no way to distinguish true information from false, it would be no use even if you had the time.
Any pattern generated in this way is random in the sense of being a random selection from all possible patterns. A few look very regular and planned. Some look totally chaotic. Between the extremes are a massive number that have a degree of clumpiness. These intermediate types are in the majority so they come up most often.
What some clients really want is a one of the sub-set of patterns that appear chaotic. Is there a way to favour generation of these patterns? Press the 'Scatter' button and you will see there is.
The macro behind this button begins by making a list of all the empty positions to be tiled. The list is shuffled using the same method as before and the area is tiled in the random order so produced.
By itself this would not reduce clustering. It is just a different way of generating the same truly random patterns as the first button, warty space invaders and all.
The twist is that at each position the program counts the colours of nearby tiles. A random choice is made from among the least popular colours. If the stock of the chosen colour has been used up then an alternative unpopular choice is made.
Here, the program would see that the position about to be tiled has two orange neighbours, one yellow and one green. Its first choice would be a random selection from the locally unrepresented blue and purple colours. If stocks of both were gone, it would make a random choice between the next most unpopular colours, yellow and green. Should they be used up also, orange would then be selected.
The effect of this local survey is to increase colour diversity around a tile, thus decreasing clumping. The brain's forte is identifying patterns so even with this improved algorithm you will see artefacts, though fewer than before and not so large. Increasing the number of colours helps. Six colours seems to be the minimum required to produce consistently good results with the Scatter algorithm.
Another variable you can play with is the size of the area examined around each tile. This is the value marked Scope, and note that it is not used by the first, truly random algorithm. A scope of 1 means the program examines immediately adjacent tiles – a 3 x 3 block. Upping the value to 2 extends the reach by another tile in every direction, giving a block size of 5 x 5, and so on.
Pushing the scope value upwards without also increasing the number of available colours increases local clustering. This is because examining a larger number of tiles reduces the possibility of making a unique choice.
The program run by the 'Scatter+' button is a high-strength version of Scatter. It is hard-wired with a scope of 2 and takes account of a tile's relative position as well as its colour.
The influence of the four immediately adjacent horizontal and vertical tiles is magnified by a weighting system. It is much less likely that a tile of the same colour as one in those positions will be selected. Push this improved program into a corner by choosing a low number of colours and it tends to favour diagonal lines. On five colours Scatter+ generates good patterns where Scatter exhibits clumping.
You may find a pattern that's almost right except for a little tweaking. The worksheet has provision for this, allowing manual swapping of tiles. Select one tile, [Ctrl]-click the other and press the Swap button.
The spreadsheet and writing above were dreamt up in the early 2000's as a quick answer to a problem.
Many people have since used the program to help with their tiling. It is a strange form of immortality because there's a little bit of me spread thinly over hundreds or maybe thousands of walls and floors the world over.Who walks on me now? Who weeps or pees, frolicks or flees? I'll never know.
I gave random tile colouring little further attention until Henrik Lieng, a computer scientist then at Cambridge University, alerted me to research he had worked on. It combines ideas from Gestalt psychology with a rigorous analytical treatment. You can read the paper here.
I am now aware of a wider academic effort that covers a variety of approaches to random colouring and cousin problems that depend on the same underlying mathematical properties. This still isn't fully solved in the sense that although effective approaches are known, there is room for improvement – more speed, fewer artefacts at small and large scales.
Insight into this rich subject can be found on this page (scroll well down for the tiling) as well as the Cambridge paper and their references to other work.
One of the interesting points I took from this is that clumping effects should be considered when tiles are similar in colour, not only when they are identical.
So how can a program know how similar two colours appear to be? Research has been done on that too, and we can put a number on it.
If you want to know about that, Google around the CIE LAB colour space and colour deltas. There is a .Net library on GitHub, ColorMine, that makes light work of converting RGB colours to LAB and working out the distances between them. Thanks to Joe Zack for that.
My Excel code does not assess colour similarity. It assumes every stock item is not only different, but equally different. The code produces reasonable results without colour awareness but it is clear that much could be done to improve it and introduce new ideas.
The spreadsheet pushes random tiling as far as I want to go with VBA macros. Excel was a convenient place to trial an idea but I don't like it for significant projects that aren't really spreadsheets. VBA lacks features of modern languages and frameworks that enable you to focus more on what you're trying to achieve and less on writing code to cover standard computational building blocks.
I have been experimenting with a standalone program (WPF / C# – so much nicer) that takes advantage of ColorMine to scatter tiles based on colour differences. It looks promising…
…but the user interface is not polished enough for public consumption (so no download), nor is the code optimised for speed or output quality. Hence it is slow and a few artefacts can be seen.
Can you spot them? I don't just mean the odd places where neighbouring tiles have the same colour. Lean back and squint so that edges blur. Focus on the general impression of colour in different areas. You should see the arrangement is slightly patchy – some areas darker, others paler, according to the local blend of colours.
The patches are low-frequency effects. My algorithm is fairly good at avoiding short-range (high frequency) colour collisions but less distinct clustering emerges at longer distances.
Better patterns will be closer to the characteristics of blue noise. Visible low-frequency effects in the above example show that it is some way from ideal if we seek pattern-free arrangements.
I believe the reason is that I am using a hybrid approach in which the area is seeded with random choices. Colour repulsion is used locally when there are neighbours to consider. The early free choices steer the arrangement towards normal random clustering. This gets blurred when gaps are filled based on colour avoidance.
An improved approach may be to incorporate the method described in this paper by Li-Yi Wei, entitled Multi-Class Blue Noise Sampling. It talks about spacing colours in a way that exhibits blue noise characteristics.
So there is an endless fund of potential work in this! If ever the program gets to a presentable state I will post it here.
Version 3.11, January 2015
Compatibility Windows versions of Excel 97 onwards. Reported to fail on the Mac version.
Description / note Ensure macros are enabled if prompted when loading the file. This version fixes an error with the checkbox.
I offer professional services in software development, databases, data manipulation and writing. Find out more at IanSharpe.com.
IanSharpe.com and other sites I manage are hosted at DigitalOcean. I am happy to recommend this company for Linux VPS hosting.
I get a commission if you open an account through this link and stay a while. DigitalOcean will credit $10 when you add a payment method. That gets you a generous slab of Linux-based VPS hosting for nothing and helps support this site.
This is the personal web site of Ian Sharpe, a software developer and writer based in Bath, United Kingdom.
Before taking up software development full-time I had a career in computer magazine publishing. A hobbyist obsession with programming diverted me into magazines from an even earlier career in railway civil engineering. That was in 1986 and for the next 16 years I was a writer and editor on high-profile titles. Some visitors may remember me from publications of yore such as PC Plus, PC Answers, PC Today and CPC Computing.
Much of the material on this site originates from that period and was written to fill column inches to short dealines. I created the software as an adjunct to the writing so not a lot time was spent on it!
There used to be a lot more (I reckon I wrote over a million words over my magazine career) but I drop items that become too outdated. The remainder survives while it stays popular, which remarkably it does.
And so the articles do not represent current interests or recent experience although I occasionally refresh them with new material.
There is no binding theme, just whatever came into my head to publish. Hence '@random'. That said, several pages deal with aspects of randomness. I don't know how that came about and it certainly wasn't a plan.
Some of my publishing experiences were exceptionally satisfying and many were great fun. But the world changed and the tide turned against the big-circulation magazines of the eighties and nineties. The gravy train was running out of juice and it was time to disembark, re-invent and move on.
I had an ambition to write software professionally. I also had some ability, always having done it as a sideshow to my writing. I got my first magazine job partly on the strength of my ability to crank out publishable code in Z80 assembly language and Basic. Over the years I edited reams of programming tutorials in a variety of languages written by some very capable people.
So I switched career once more and spent five years as a developer at a busy digital printing outfit.
Now I work freelance, dividing my time between projects at local technical software/electronics company Dot Software and whatever else I am engaged in. My professional services site is www.iansharpe.com.
Because my personal site is mostly old material, writing emails about it is not my favourite activity. Even 'quick' questions can take longer to answer than you might expect. It feels rude not to reply and sometimes I will respond, but forgive me if I do not.
If there is a real problem (or opportunity!) my address is email@example.com.
The download is in progress and should be visible soon.