find + mkdir is Turing complete (retracted)
The proof is flawed and I retract the claim that I proved that find + mkdir is Turing complete. See https://news.ycombinator.com/item?id=41117141. I will update the article if I can fix the proof.
I believe that if you could also move and link files, you could actually simulate lambda calculus with a similar technique. I imagine something like this would work, where applications are described by a shared prefix at the same directory depth, and the order of application is encoded in lexicographical name order:
λx.x:
$ tree .
.
└── x
└── a -> ../x/
λsz.(s (s (s z))):
$ tree .
.
└── s
└── z
├── a -> ../../s/
├── b -> ../../s/
├── ca -> ../../s/
└── cb -> ../z/
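If I read the encoding right, the second tree could be built mechanically. A hedged sketch in Python (the layout and link targets are copied from the trees above; everything else, including doing it via `os` rather than mkdir/ln -s, is just for illustration):

```python
import os
import tempfile

# Illustrative sketch: build the directory/symlink encoding of
# λsz.(s (s (s z))) shown above, using os.makedirs/os.symlink in
# place of mkdir and ln -s.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "s", "z"))

# Link targets copied from the tree above; names encode application order.
links = {"a": "../../s/", "b": "../../s/", "ca": "../../s/", "cb": "../z/"}
for name, target in links.items():
    os.symlink(target, os.path.join(root, "s", "z", name))

for name in sorted(links):
    print(name, "->", os.readlink(os.path.join(root, "s", "z", name)))
```

Whether find alone could then reduce such a structure is exactly the open part of the idea.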
From the top of the page:
> find + mkdir is Turing complete (retracted)
> The proof is flawed and I retract the claim that I proved that find + mkdir is Turing complete. See https://news.ycombinator.com/item?id=41117141. I will update the article if I could fix the proof.
It's already fixed.
So can you implement Folders with it?
https://www.danieltemkin.com/Esolangs/Folders/
> In Windows, folders are entirely free in terms of disk space! For proof, create say 352,449 folders and get properties on it.
To be that guy for a moment: well, hackchewally…¹
Directory entries do take up space in the MFT, but that doesn't show up in Explorer, which only counts allocated blocks elsewhere. You will eventually hit a space issue creating empty directories as the MFT grows to accommodate them.
You can do similar tricks with small files. Create an empty text file and check its properties: it will show 0 bytes in length and 0 bytes on disk. Put in ~400 bytes of text and check again: Explorer will show 400 bytes in length but 0 bytes on disk, because the data is resident in the file's record in the pre-allocated MFT. Double that data and it becomes big enough that a disk block is allocated: in Properties you'll now see 800 bytes in length and 4,096 bytes (one block) on disk. Drop it back to 400 bytes and the data won't move back into the MFT: you'll now see 400 bytes in length and 4,096 bytes on disk.
--
[1] though don't let this put you off enjoying the splendid thing overall!
This is phenomenal
I don't understand how this shows Turing completeness. The implementation of the Rule 110 automaton seems to be limited in both width (not Turing complete, because there are only finitely many states of a given width) and iteration count (not Turing complete, because it always terminates).
Can you write an implementation of rule 110 with arbitrary (i.e. unbounded) width and depth?
It's still ok if the implementation limits it rather than the concept. I mean, your computer has finite memory rather than infinite tape, so it doesn't meet that requirement either regardless of language/method.
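For what it's worth, the width limit is an implementation detail rather than a conceptual one: Rule 110's quiescent background is all zeros, so a simulator only needs to track a finite window and let it grow each step. A minimal Python sketch of the automaton itself (not the article's find/mkdir construction):

```python
# Rule 110 with an unbounded (grow-as-needed) tape: the background is
# all zeros, so we only track a finite window and pad it each step.
RULE = 110  # bit i of 110 gives the next state for neighborhood value i

def step(cells):
    cells = [0, 0] + cells + [0, 0]  # room for the pattern to spread
    nxt = []
    for i in range(1, len(cells) - 1):
        neighborhood = cells[i - 1] * 4 + cells[i] * 2 + cells[i + 1]
        nxt.append((RULE >> neighborhood) & 1)
    return nxt

tape = [1]
for _ in range(5):
    tape = step(tape)
    print("".join("#" if c else "." for c in tape))
```

The window grows by two cells per step, so nothing bounds the width except memory, which is the same caveat any physical computer has.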
C is maybe technically not Turing complete either: https://cs.stackexchange.com/questions/60965/is-c-actually-t...
The author has since updated their post:
> The proof is flawed and I retract the claim that I proved that find + mkdir is Turing complete. See https://news.ycombinator.com/item?id=41117141. I will update the article if I could fix the proof.
Retracted:
> The proof is flawed and I retract the claim that I proved that find + mkdir is Turing complete
I thought that this was going to use some interesting form of lambda calculus, but instead it simply relies on the regex parser of find to compute things.
Not the first time someone has coerced a regex into doing some nontrivial computation; here's a memorable example:
http://realgl.blogspot.com/2013/08/battlecode.html (scroll down to "Regular Expression Pathfinding")
I suspect this proof could be greatly simplified by the use of tag systems (https://en.m.wikipedia.org/wiki/Tag_system) rather than cellular automata.
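For illustration, the read-delete-append loop of a 2-tag system is only a few lines. The rules below are made up purely to show the mechanics; they are not taken from any published universal system:

```python
def run_tag(word, rules, deletion=2, max_steps=1000):
    """Run a tag system: read the first symbol, delete `deletion`
    symbols from the front, append that symbol's production."""
    for _ in range(max_steps):
        if len(word) < deletion:
            return word  # halted: word too short to continue
        production = rules[word[0]]
        word = word[deletion:] + production
    raise RuntimeError("step limit reached (may not halt)")

# Toy rules, chosen only to show the mechanics:
print(run_tag("aaa", {"a": "bb", "b": ""}))  # prints "b"
```

Encoding "delete two from the front, append at the back" as directory operations is plausibly simpler than tracking a full cellular-automaton row, which is presumably the point of the suggestion.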
Of course no implementation is infinite, but in this case PATH_MAX, with a typical value of 4096, seems particularly low.
Check out the section "Expected questions and answers". For GNU find it seems to work with path lengths larger than 4096.
The blog post addresses this by using relative paths. Tested up to a path length of 30k apparently.
Observation: any piece of software or service (or anything used in its chain) that implements and/or consumes regular expressions (a.k.a. REs, regexps) is potentially Turing complete, and should be audited for Turing completeness if security in that context is a concern...
Strictly speaking, the original definition of regex requires only a finite state machine with zero stacks.
You need two stacks for Turing completeness.
Though a lot of regex libraries support much more than just "regex".
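Concretely, backreferences (which the original definition lacks) already take matching beyond regular languages. The classic party trick is testing unary primality; a Python sketch:

```python
import re

# Backreferences go beyond regular languages: (11+?)\1+ matches a
# unary string iff its length is composite (a block of >= 2 ones,
# repeated at least twice), so failure to match means "prime".
def is_prime_unary(n):
    return n >= 2 and not re.fullmatch(r"(11+?)\1+", "1" * n)

print([n for n in range(2, 20) if is_prime_unary(n)])
# → [2, 3, 5, 7, 11, 13, 17, 19]
```

This still isn't Turing completeness (no loops, no unbounded state between matches), but it shows how far past "regular" most real regex engines already are.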
My find is not only provably Turing complete; it's not even a Turing tarpit, and it compiles to native code.
What's the implication of this for users of these commands?
Not much, practically, other than watching out for non-terminating find queries.
Well, find also supports `-exec`, so it would imply a rather big problem with the host OS if that weren't Turing complete.
Why would an OS need to be Turing complete?
Neat, now we just need a compiler for a scripting language...
Check out awka for POSIX awk.
I found this in the parent article interesting:
"The proof leverages a common technique: showing the system can execute Rule 110."
because I was not aware of "Rule 110".
Nevertheless, reading the Wikipedia page about "Rule 110", I find it astonishing that "Rule 110" has not only been the subject of a research paper, but that the paper was even the ground for a legal dispute over a non-disclosure agreement with Wolfram Research, which blocked its publication for several years.
The demonstration that "Rule 110" is capable of universal computation is completely trivial and requires no more than a sentence. It should not have been the subject of a research paper in recent decades.
There are several known pairs of functions that are sufficient for computing any Boolean functions, for example AND and NOT, OR and NOT, OR and XOR, AND and XOR. The last pair is a.k.a. multiplication and addition modulo 2.
Whenever there is a domain where all possible functions can be expressed as combinations of a finite set of primitives, it is also possible to express every member of that set using a single primitive function. That single primitive combines all the others in such a way that composing it with itself in various ways can recover each of the original primitives from the compound.
Applying this to Boolean functions yields various choices for a single primitive that can generate all Boolean functions, for instance NAND, which combines NOT and AND, or NOR, which combines NOT and OR.
In general, all the ado about how various computational domains can be reduced to a single primitive function is unwarranted and not interesting at all. Such combined primitives do not change the actual number of primitives: they just replace N distinct simple primitives with one compound primitive that must be used in N distinct ways. That changes nothing about the complexity of the domain and does not make it easier to understand.
"Rule 110" is just another banal example of this technique. Like NAND combines NOT and AND in a separable way, "Rule 110" combines multiplication and addition modulo 2, a.k.a. AND and XOR, in a separable way. Therefore it can express any Boolean function, therefore, by encoding, also any computable function.
There is absolutely no advantage in showing that some system can compute "Rule 110". It is simpler and clearer to show that it can compute AND and XOR, or AND and NOT.
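The AND/XOR decomposition claimed here does check out mechanically: over GF(2), Rule 110's local update is c ⊕ r ⊕ cr ⊕ lcr. A quick Python verification (of the decomposition only, independent of whether that makes the universality result trivial):

```python
# Check the claim: Rule 110's local update is expressible with AND and
# XOR alone. Over GF(2): new = c XOR r XOR (c AND r) XOR (l AND c AND r).
for i in range(8):
    l, c, r = (i >> 2) & 1, (i >> 1) & 1, i & 1
    rule110_bit = (110 >> i) & 1  # bit i of the rule number 110
    and_xor_form = c ^ r ^ (c & r) ^ (l & c & r)
    assert rule110_bit == and_xor_form
print("Rule 110 == c ^ r ^ (c & r) ^ (l & c & r) for all 8 neighborhoods")
```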
As far as I can tell from your comment, you have the terms "functionally complete" and "Turing complete" confused. These are emphatically not the same thing.
A circuit of (e.g.) NAND gates defines a mathematical function over a fixed, finite number of variables (the number of input wires to your circuit) and with a fixed number of outputs (likewise the output wires).
A Turing-complete computer accepts inputs of unbounded length, i.e., for any natural number n it accepts some input of length at least n. It can also output unbounded strings.
These two are fundamentally completely different. Functional completeness for a set of gates doesn't tell you much about Turing completeness. For all of the interesting stuff to do with Turing machines you need this unbounded input size so you can do things like consider descriptions of other Turing machines as inputs to your Turing machine.
Essentially what you need is something equivalent to looping or recursion. Note that the Halting problem is completely trivial for NAND circuits, exactly because there is no looping.