<?xml version="1.0" encoding="UTF-8" ?>
  <resource>
  <id>5937</id>
  <path>/www/nrich/html/content/id/5937/</path>
  <resourceTypeID>1</resourceTypeID>
  <last_published>2011-02-01T00:00:01</last_published>
  <indexXML>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;mdoxml version=&quot;1.0&quot;&gt;&lt;br&gt;&lt;/br&gt;
&lt;ul id=&quot;stemLinks&quot;&gt;
&lt;li&gt;&lt;a href=&quot;http://nrich.maths.org/6496&quot;&gt;Warm-up&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://nrich.maths.org/6569&quot;&gt;Try this next&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://nrich.maths.org/2370&quot;&gt;Think higher&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://plus.maths.org/content/os/issue9/features/benford/index&quot;&gt;Read: mathematics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://plus.maths.org/content/origins-fractals&quot;&gt;Read: science&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://en.wikipedia.org/wiki/Scale_invariance&quot;&gt;Explore further&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div&gt; &lt;/div&gt;
&lt;br&gt;&lt;/br&gt;
&lt;p&gt;&lt;span style=&quot;font-style: italic;&quot;&gt;In 1881 an astronomer, Newcomb, first noticed a very bizarre property of some naturally occurring sets of numbers: if you list the lengths of all the rivers in a country, about&lt;/span&gt; $30\%$ &lt;span style=&quot;font-style: italic;&quot;&gt;of them are numbers that have&lt;/span&gt; $1$ &lt;span style=&quot;font-style: italic;&quot;&gt;as a first digit, about&lt;/span&gt; $18\%$ &lt;span style=&quot;font-style: italic;&quot;&gt;have&lt;/span&gt; $2$ &lt;span style=&quot;font-style: italic;&quot;&gt;as a first digit and so on, with only about&lt;/span&gt; $5\%$ &lt;span style=&quot;font-style: italic;&quot;&gt;of them having&lt;/span&gt; $9$ &lt;span style=&quot;font-style: italic;&quot;&gt;as a first digit. What&amp;#39;s more, if you convert the lengths into any other unit (miles, feet, mm, etc.) the distribution of first digits remains the same (we say the distribution
is &amp;#39;scale invariant&amp;#39;). The same pattern of first digits occurs in many sets of seemingly random numbers. It is called Benford&amp;#39;s Law, after its second discoverer, the physicist Frank Benford, working in 1938. In this problem we shall use probability to predict the numbers observed by Newcomb.&lt;/span&gt;&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
&lt;span style=&quot;font-style: italic;&quot;&gt;You will need to know that a function f(x) is called &amp;#39;scale invariant&amp;#39; if scaling x by a fixed amount does not change the shape of the function. Mathematically, the property of scale invariance is written as: f(Ax) = k f(x) for fixed numbers A and k&lt;/span&gt;&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
Show that if a probability density function $f(x)$ with $x&amp;amp;gt; 0$ is scale invariant then&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
$f(Ax) = f(x) / A$&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
Can a function $f(x)$ be both scale invariant and a probability density function if $x$ is allowed to take any non-negative value? Experiment with various forms of $f(x)$ to try to find out.&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
How would your results change if $x$ were restricted to the range $a&amp;amp;lt; x&amp;amp;lt; b$, for some positive numbers $a$ and $b$?&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
Suppose that $a = 1$ and $b = 1,000,000$. Which of the functions will make a scale invariant probability density function? For this density, show that $$P(1&amp;amp;lt; x&amp;amp;lt; 2) = P(100&amp;amp;lt; x&amp;amp;lt; 200) = P(100000&amp;amp;lt; x&amp;amp;lt; 200000)$$ Suppose that a number $x$ is drawn randomly from this distribution. Calculate the probability that its first digit is $1$. Extend this to calculate the probability that the
first digit is $2, 3, 4, \ldots, 9$. How would these results change if $b$ were $1,000,000,000$ or $1,000,000,000,000$?&lt;/p&gt;
&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
&lt;div class=&quot;framework&quot;&gt;NOTES AND BACKGROUND&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
You might like to consider what sorts of random phenomena might give rise to a scale invariant distribution. How would this relate to the units used to make a measurement? Find some random real-world data in a book and tabulate their first digits. What do you notice?&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
It is worth noting that an excellent solution was sent in to this problem. This is well worth a read; see the solution tab above.&lt;/div&gt;
&lt;br&gt;&lt;/br&gt;&lt;/mdoxml&gt;</indexXML>
  <solutionXML>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;mdoxml version=&quot;1.0&quot;&gt;&lt;br&gt;&lt;/br&gt;
&lt;span class=&quot;editorial&quot;&gt;Peter Townsend successfully solved this
fascinating problem, providing us with one of the best solutions
we've ever received. Awesome!&lt;/span&gt;&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
&lt;mdo:image width=&quot;490&quot; height=&quot;528&quot; src=&quot;soln1.jpg&quot; alt=&quot;&quot;&gt;&lt;/mdo:image&gt;&lt;br&gt;&lt;/br&gt;
&lt;mdo:image width=&quot;609&quot; height=&quot;400&quot; src=&quot;soln2.jpg&quot; alt=&quot;&quot;&gt;&lt;/mdo:image&gt;&lt;br&gt;&lt;/br&gt;
&lt;mdo:image width=&quot;615&quot; height=&quot;535&quot; alt=&quot;&quot; src=&quot;soln3.jpg&quot;&gt;&lt;/mdo:image&gt;&lt;br&gt;&lt;/br&gt;
&lt;mdo:image width=&quot;584&quot; height=&quot;387&quot; src=&quot;soln4.jpg&quot; alt=&quot;&quot;&gt;&lt;/mdo:image&gt;&lt;br&gt;&lt;/br&gt;
&lt;mdo:image width=&quot;598&quot; height=&quot;575&quot; src=&quot;soln5.jpg&quot; alt=&quot;&quot;&gt;&lt;/mdo:image&gt;&lt;br&gt;&lt;/br&gt;
&lt;mdo:image width=&quot;523&quot; height=&quot;318&quot; src=&quot;soln6.jpg&quot; alt=&quot;&quot;&gt;&lt;/mdo:image&gt;&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;&lt;/mdoxml&gt;</solutionXML>
  <noteXML>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;mdoxml version=&quot;1.0&quot;&gt;&lt;br&gt;&lt;/br&gt;

&lt;h3&gt;Why do this problem?&lt;/h3&gt;
This problem offers a fascinating exploration into probability
density functions for real world data. Whilst the individual steps
are quite simple, the problem draws together many strands from
distribution theory. The results can be tested on any set of data
from any geography book, giving an interesting relevance to the
mathematics.&lt;br&gt;&lt;/br&gt;

&lt;h3&gt;Possible approach&lt;/h3&gt;
&lt;div&gt;The first obstacle to overcome is that of notation: can the
students understand what is being asked?&lt;/div&gt;
&lt;br&gt;&lt;/br&gt;

&lt;div&gt;The question involves little computation but requires clear
thinking about the ideas. This might be facilitated in a group
discussion, but might also require individual work.&lt;/div&gt;
&lt;h3&gt;Key questions&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;If a function is to be a probability density function, what is
the major property it must possess?&lt;/li&gt;
&lt;li&gt;What ranges of values will start with a digit $1$?&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Possible extension&lt;/h3&gt;
Consider carefully why this problem involves 'scale invariance'.
Consider the restriction of scale invariance on real world data.
Which sets of real world data do you think will be modelled by this
distribution? Why? 
&lt;h3&gt;Possible support&lt;/h3&gt;
Skip the first part and provide students with the scale invariant
functions. Also, first use the range $1 &amp;lt; x &amp;lt; 10$ in the last
part of the question.&lt;br&gt;&lt;/br&gt;&lt;/mdoxml&gt;</noteXML>
  <clueXML>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;mdoxml version=&quot;1.0&quot;&gt;&lt;br&gt;&lt;/br&gt;
Don't forget that probability density functions must integrate
to $1$ over the allowed range of values.&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
Try changing variables for the first part.&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
For the second part, note that under scaling $x^2\rightarrow (Ax)^2$,
which does not equal $x^2/A$ in general.&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
How could you make the two sides match for other powers of
$x$?&lt;br&gt;&lt;/br&gt;&lt;/mdoxml&gt;</clueXML>
  <canonXML>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;mdoxml version=&quot;1.0&quot;&gt;&lt;br&gt;&lt;/br&gt;
Change variables and use the fact that the integral of the density
is $1$ to give the result.&lt;br&gt;&lt;/br&gt;
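&lt;br&gt;&lt;/br&gt;
As a sketch of this step, assuming $A$ is positive and $x$ ranges over all positive values, substitute $x = Au$ in the normalisation integral and use $f(Au) = k f(u)$: $$1 = \int_0^\infty f(x)\,dx = \int_0^\infty f(Au)\,A\,du = Ak\int_0^\infty f(u)\,du = Ak$$ so $k = 1/A$ and hence $f(Ax) = f(x)/A$.&lt;br&gt;&lt;/br&gt;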
&lt;br&gt;&lt;/br&gt;
The only functions satisfying $f(Ax) = f(x)/A$ for every positive $A$
are $B/x$, for a constant $B$.&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
These cannot be densities on the non-negative numbers because the
integral of $B/x$ over $[0, \infty]$ diverges (formally it involves '$\log(0)$').&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
We can 'regularise' by restricting to a finite range $[a, b]$ with $a$ and $b$ positive: on this
range $1/(x \log(b/a))$ is a probability density, where $\log$ is the natural logarithm.&lt;br&gt;&lt;/br&gt;
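&lt;br&gt;&lt;/br&gt;
The normalising constant on a finite range can be checked directly. As a sketch, assuming positive numbers $a$ and $b$ with $a$ smaller than $b$, and writing $\ln$ for the natural logarithm: $$\int_a^b \frac{B}{x}\,dx = B\ln\left(\frac{b}{a}\right) = 1 \quad\Rightarrow\quad B = \frac{1}{\ln(b/a)}$$ For example, with $a = 1$ and $b = 1,000,000$ this gives $f(x) = 1/(6x\ln 10)$.&lt;br&gt;&lt;/br&gt;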
&lt;br&gt;&lt;/br&gt;
The first-digit probabilities (to three decimal places) are:&lt;br&gt;&lt;/br&gt;
&lt;br&gt;&lt;/br&gt;
 
&lt;table border=&quot;1&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;First digit&lt;/td&gt;
&lt;td&gt;Probability&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0.301&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0.176&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;0.125&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;0.097&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;0.079&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;0.067&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;0.058&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;0.051&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;0.046&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
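&lt;br&gt;&lt;/br&gt;
These values can be reproduced with a short calculation. As a sketch, assume $a = 1$ and $b = 10^n$ for a positive whole number $n$ (so $n = 6$ in the problem), with density $f(x) = 1/(nx\ln 10)$. A number has first digit $d$ when it lies between $d \times 10^j$ and $(d+1) \times 10^j$ for some $j = 0, 1, \ldots, n-1$, so $$P(\text{first digit is } d) = \sum_{j=0}^{n-1} \int_{d \times 10^j}^{(d+1) \times 10^j} \frac{dx}{nx\ln 10} = \sum_{j=0}^{n-1} \frac{\ln(1 + 1/d)}{n\ln 10} = \log_{10}\left(1 + \frac{1}{d}\right)$$ which does not depend on $n$. For $d = 1$ this is $\log_{10} 2 \approx 0.301$.&lt;br&gt;&lt;/br&gt;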
&lt;br&gt;&lt;/br&gt;&lt;/mdoxml&gt;</canonXML>
  <end_user_role>2</end_user_role>
  <difficulty>5</difficulty>
  <keystage1>0</keystage1>
  <keystage2>0</keystage2>
  <keystage3>0</keystage3>
  <keystage4>0</keystage4>
  <keystage4plus>1</keystage4plus>
  <title>Scale invariance</title>
  <description>By exploring the concept of scale invariance, find the probability
that a random piece of real data begins with a 1.</description>
  <spec_group>Probability
    <specifier>Probability</specifier>
  </spec_group>
  <spec_group>Advanced Probability and Statistics
    <specifier>Probability density functions</specifier>
  </spec_group>
  <spec_group>Applications
    <specifier>Maths Supporting SET</specifier>
  </spec_group>
  <spec_group>Using, Applying and Reasoning about Mathematics
    <specifier>Investigations</specifier>
  </spec_group>
</resource>