Friday 2 November 2012

Random Shapefile Data

I often need to quickly generate random attribute values in a shapefile, whether to test code, write an answer on GIS.SE or mock up some data to test a workflow. There are a few simple ways to do this in pure Python using the various random functions, but if all I need to do is populate an attribute table, the ArcGIS field calculator will do the job.
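
For comparison, the pure Python route can be sketched with nothing but the standard library. This is a minimal illustration rather than a full workflow: the function name and the field names `FID`, `PERCENT` and `SCORE` are made up for the example, and writing the rows into an actual shapefile would need a separate library, which is not shown.

```python
import random


def random_attribute_rows(n_rows, seed=None):
    """Return n_rows dicts of random attribute values (hypothetical fields)."""
    rng = random.Random(seed)  # a seed makes the mock data repeatable
    return [
        {
            "FID": i,
            # Integer percentage, both ends inclusive.
            "PERCENT": rng.randint(0, 100),
            # Normally distributed float: mean 50, standard deviation 10.
            "SCORE": round(rng.gauss(50.0, 10.0), 2),
        }
        for i in range(n_rows)
    ]


rows = random_attribute_rows(5, seed=42)
for row in rows:
    print(row)
```

Passing a seed is optional, but it helps when the same mock table has to be regenerated for a test or a posted answer.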

The syntax of the command is very un-Pythonic, but it does allow a range of different random datasets to be generated. I recently used this command to create some percentage data at work:
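
A likely form of that expression, reconstructed on the assumption that the command is the `arcgis.rand` function from the Calculate Field help and that the Python parser is selected in the field calculator:

```python
# Text for the Field Calculator expression box (Python parser selected).
# arcgis.rand is only defined inside ArcGIS, so it is held as a string here
# rather than called directly.
expression = 'arcgis.rand("Integer 0 100")'
print(expression)
```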


This clearly creates random integers between 0 and 100. After reading through the help files I found a list of all the different random distributions that can be used, which I have reproduced below for my own convenience:
  • UNIFORM {Minimum}, {Maximum}—A uniform distribution with a range defined by the {Minimum} and {Maximum}. Both the {Minimum} and {Maximum} are of type double. The default values are 0.0 for {Minimum} and 1.0 for {Maximum}.
  • INTEGER {Minimum}, {Maximum}—An integer distribution with a range defined by the {Minimum} and {Maximum}. Both the {Minimum} and {Maximum} are of type long. The default values are 1 for {Minimum} and 10 for {Maximum}.
  • NORMAL {Mean}, {Standard Deviation}—A Normal distribution with a defined {Mean} and {Standard Deviation}. Both the {Mean} and {Standard Deviation} are of type double. The default values are 0.0 for {Mean} and 1.0 for {Standard Deviation}.
  • EXPONENTIAL {Mean}—An Exponential distribution with a defined {Mean}. The {Mean} is of type double. The default value for {Mean} is 1.0.
  • POISSON {Mean}—A Poisson distribution with a defined {Mean}. The {Mean} is of type double. The default value for {Mean} is 1.0.
  • GAMMA {Alpha}, {Beta}—A Gamma distribution with a defined {Alpha} and {Beta}. Both the {Alpha} and {Beta} are of type double. The default values are 1.0 for {Alpha} and 1.0 for {Beta}.
  • BINOMIAL {N}, {Probability}—A Binomial distribution with a defined {N} and {Probability}. The {N} is of type long, and {Probability} is of type double. The default values are 10 for {N} and 0.5 for {Probability}.
  • GEOMETRIC {Probability}—A geometric distribution with a defined {Probability}. The {Probability} is of type double. The default value for {Probability} is 0.5.
  • NEGATIVE BINOMIAL {N}, {Probability}—A negative binomial distribution with a defined {N} and {Probability}. The {N} is of type long, and {Probability} is of type double. The default values are 10 for {N} and 0.5 for {Probability}.
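
Outside ArcGIS, several of the distributions above have rough stand-ins in Python's standard `random` module. The sketch below is only an approximation: `sample` is a hypothetical helper, and the parameter meanings (whether {Maximum} is inclusive, whether {Beta} is a scale) are assumptions that should be checked against the ArcGIS help. The defaults mirror the ones listed above.

```python
import random


def sample(distribution, *params, rng=random):
    """Draw one value from a named distribution (hypothetical helper)."""
    name = distribution.upper()
    if name == "UNIFORM":
        low, high = params or (0.0, 1.0)
        return rng.uniform(low, high)
    if name == "INTEGER":
        low, high = params or (1, 10)
        return rng.randint(low, high)  # both ends inclusive (assumed to match)
    if name == "NORMAL":
        mean, std_dev = params or (0.0, 1.0)
        return rng.gauss(mean, std_dev)
    if name == "EXPONENTIAL":
        (mean,) = params or (1.0,)
        return rng.expovariate(1.0 / mean)  # expovariate takes the rate, 1/mean
    if name == "GAMMA":
        alpha, beta = params or (1.0, 1.0)
        return rng.gammavariate(alpha, beta)  # Python treats beta as a scale
    if name == "BINOMIAL":
        n, probability = params or (10, 0.5)
        # Count successes over n independent trials.
        return sum(rng.random() < probability for _ in range(n))
    raise ValueError("no standard-library stand-in sketched for " + name)


rng = random.Random(0)
percentages = [sample("INTEGER", 0, 100, rng=rng) for _ in range(10)]
print(percentages)
```

Poisson, geometric and negative binomial are left out because the standard library has no direct equivalents for them; a numerical library would be the usual choice there.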
There are, of course, many other ways to do this, but I find that this weird little function does a pretty good job once you get to grips with the odd syntax.