Thursday, November 24, 2016

Algorithm to calculate Inverse Factorial of Natural Numbers

Note:

  1. All numbers are Natural numbers. Unless until specified
  2. This works if the inverse factorial of the number is guaranteed to be a natural number. If you came here hoping to get gamma or pi function based solutions, Sorry turn back.
Calculating factorial is hard. But is there a way one can speedily calculate inverse factorial?. Suppose you were given a number that was promised to be factorial of a whole number will you able to tell the inverse factorial of that number. The post presents two observations on factorials and then formulates an algorithm to use these observations to perform the inverse of a factorial.

Observation 1 [Counting trailing zeroes]

Observation: All the factorials after 5! end in zeroes, i.e. The factorial has a lot of trailing zeroes at the end.

First step was to gather data. And to make matters simple I wrote a python script to plot the number of zeroes in factorial of the given number vs the number.

The graph is linear. Which is good!!. Now maybe we could find some relation to the number of zeroes in factorial of the number and the number itself. Just to make sure I also ran it from 0 to 20000. 



No surprises there. 
You can try the same using this python script.

How are the zeroes in the factorial related to the number itself?

This is simple a factorial is simply a product of all the natural numbers from 1 to that number. So a zero occurs every time we multiply a 10. Which means every time a 2x5 happens there is a zero. So there is a extra zero every 5 numbers. We don't have to worry about availability of '2's as they are in abundance (By the time you gain a factor of 5 we cross two even numbers).

[0] :: 0!   -> 1
[0] :: 1!   -> 1
[0] :: 2!   -> 2
[0] :: 3!   -> 6
[0] :: 4!   -> 24
[1] :: 5!   -> 120
[1] :: 6!   -> 720
[1] :: 7!   -> 5040
[1] :: 8!   -> 40320
[1] :: 9!   -> 362880
[2] :: 10! -> 3628800
[2] :: 11! -> 39916800
[2] :: 12! -> 479001600
[2] :: 13! -> 6227020800
[2] :: 14! -> 87178291200
[3] :: 15! -> 1307674368000
[3] :: 16! -> 20922789888000
[3] :: 17! -> 355687428096000
[3] :: 18! -> 6402373705728000
[3] :: 19! -> 121645100408832000
[4] :: 20! -> 2432902008176640000

It is clear that the pattern seems to follow. So the graphs look linear for big data. But are actually stepped. So generating the same graph for 0 to 100 we get

Now this is interesting. our hypothesis seems to hold, somewhat. But there seems to be no factorial with 5 zeroes at all, and there are these weird places where the zeroes seem to increase an extra digit. In fact 24! -> 620448401733239439360000 and  25! -> 15511210043330985984000000, from 4 zeroes to 6 zeroes. Lets zoom out a little.


As you can see the jumps seem to happen at factors of powers of 5. Or one zero jump at 25! 50! 75! and two zero jump at 125! 250! 375! etc. Lets plot the increase in number of zeroes.
For a larger data set



Try it out using this python script.

So this is cool. Lets try and explain. Our hypothesis that for every 2x5 there is a zero is correct. But that doesn't mean that for every 5 numbers there is a jump in number of zeroes of the factorial of the number. Lets see 24 has 4 '5's which are at 5, 10, 15, and 20, But 25 has 6 '5's at 5, 10, 15, 20 and two at 25 because 25 = 5x5. So given a number if we can calculate number of 5's that make the factorial of the number we can say the number of zeroes that the number makes. 

Formulating it given a number n

sum of (n/k) where {k = 5^m, m = 1,2,3.... | k < n}. ('/' signifies integer division)

For n = 26 we get (26/5) = 5 and (26/25) = 1. We cannot do (26/125) as 125 is > 26. So we can conclude that as 5+1 = 6. 26! has 6 zeroes. 

Doing the same for 80! we get (80/5) = 16, (80/25) = 3. Therefore 80! must have 19 trailing zeroes.

80!=71569457046263802294811533723186532165584657342365752577109445058227039255480148842668944867280814080000000000000000000

Which is correct. 

Lets summarize,

Given a number we can safely say how may zeroes that the factorial might have. Which is done by the formula above.

So given a factorial of a number, By counting the number of zeroes one must be able to say that the number is within these 5 numbers, as all of them have factorials with same number of zeroes. But how does one work this in reverse. 

Let try to figure out the number given that factorial of that number has 44 trailing numbers. 

Now we know that 44 was written as a + b + c .... , where a is number of factors of 5, b of 25 ...

So we can work reverse 44/5 = 8, 44/25 = 1. Therefore 44 - 8 = 36 must be the number of '5' factors and 8 - 1 = 7 must be the number of factors of '25' and 1 must be the number of factors of '125'

Therefore number must be somewhere between 36*5 = 180 and 37*5 = 185(not inclusive). So the factorials of numbers 180 to 184 have 44 trailing zeroes  

180!=200896062499134299656951336898466838917540340798867777940435335160044860953395980941180138112097309735631594101037399609671032132186331495273609598531966730972945653558819806475064353856858157445040809209560358463319644664891114256430017824141796753818192338642302693327818731986039603200000000000000000000000000000000000000000000

181!= 36362187312343308237908191978622497844074801684595067807218795663968119832564672550353604998289613062149318532287769329350456815925726000644523337334285978306103163294146384971986648048091326497552386466930424881860855684345291680413833226169665212441092813294256787492335190489473168179200000000000000000000000000000000000000000000

182!=6617918090846482099299290940109294607621613906596302340913820810842197809526770404164356109688709577311175972876374017941783140498482132117303247394840048051710775719534642064901569944752621422554534336981337328498675734550843085835317647162879068664278892019554735323605004669084116608614400000000000000000000000000000000000000000000

183!=1211079010624906224171770242040000913194755344907123328387229208384122199143398983962077168073033852647945203036376445283346314711222230177466494273255728793463071956674839497876987299889729720327479783667584731115257659422804284707863129430806869565563037239578516564219715854442393339376435200000000000000000000000000000000000000000000

184!=222838537954982745247605724535360168027834983462910692423250174342678484642385413049022198925438228887221917358693265932135721906864890352653834946279054097997205240028170467609365663179710268540256280194835590525207409333795988386246815815268464000063598852082447047816427717217400374445264076800000000000000000000000000000000000000000000

One has to be careful while evaluating these. Because, take for example you find 10 zeroes this corresponds to 9 + 1 not 10 - 10/5 = 8 and 10/5 = 2 therefore 8+2. The best way to evaluate this, that I could come up with is to take an array where 0 index corresponds to number of '5's and 1 corresponds to number of '25's and so on and populate it. Check the final code for my answer

Now we can point out the number from its factorial to the accuracy of 5. But how do we get to the exact number.

Observation 2 [Number of digits in the number]

The number of digits of a factorial of a number increases with the increase in the number. This was obvious, But first the data


It is steadily increasing. Though not linear. Trying the same on a larger data set.

So, Its settled. The length of the factorial has something to do with the number. However one has to take into account that for smaller numbers this wouldn't work. 

And this seems to happen only below 10 twice. Shouldn't be a big deal as for these numbers the regular way of calculating the factorial and verifying would suffice and would be sufficiently fast enough.

Now the standard way of determining the number of digits in the number is to do a logarithm base 10 of the number and add one to the lower round off. Or to say

floor(log10(n)) + 1 (floor does the round off to lower number)

For example number of digits in 342345234 can be calculated by this method as log10(342345234) = 8.53446428616 so 8 + 1 = 9

Now to the magic part 'Stirling's Approximation'. It is written as 

ln(n!) = nln(n) - n , where ln is logarithm base e

So given n!, floor( log10(n!) + 1) gives us the number of digits in the number. 
Adapting the approximation formula. 

Let d be the number of digits in the factorial of a number n

d = floor( log10(n!) ) + 1
d = floor( ln(n!) / ln(10) ) + 1
d = floor( ( nln(n) - n ) / ln(10) ) + 1 

Can we approximate n from d?. I am not gonna try. But some careful observation should tell you that it might not simplify things.

Algorithm for Inverse Factorial

Now we can get back to the main problem. 
We can make guesses for 5 numbers and calculate the number of digits to pinpoint the number. As long as you can calculate ln(n) and ln(10) accurately we can approximate the number. And hence the inverse factorial.

However, its worth noting that it is all an approximation. And might not hold all too well for long. You might want to read through the error in stirling's approximation. But this must be accurate for quite large numbers. As the number of digits increase too rapidly in factorial function evident from the graph.

So for the final act lets put all together and guess the factorial of the number.

I wrote a python script to do all the simulations. Get it here.

Here are the results.
This is the error in determining the inverse factorial of the number vs the number. The script used to generate this is here.



And this shows the number vs the inverse factorial of the factorial of the number. You can try it out using this script.



I've tried this method up to 20000!. And does seem to do really well. However any bigger would heat up my computer crazy. But if you can, try and let me know.

Cheers!!


Note:
  1. This algorithm assumes that there exists an inverse factorial for the given number. However this might not be the case always.
  2. This only works for numbers written in base 10. Which is very inefficient. However it can be made to work for other base numbers with a few modifications.

No comments:

Post a Comment