By bramas  6 Mar 2018
Hi everyone! I am an associate professor in a french university (Strasbourg) and I started to work on distributed ledger technologies recently. After a lots of reading I found the tangle very interesting and perfect to start some new research because it's in the early stage, at least in the research part. I'm not really interested in the IOTA implementation but more on the tangle itself, so I hope I'll not create FUD and I want to stay away from the twitter wars or thing like that. I'm also interested in other technologies (cardano, \bthat who cannot be named in here\b, etc.) but I have no personal interests in any particular crypto currency. My goal is to define some models (starting with simple ones) and prove formal results according to them in order to improve the protocol.
I started writing a paper on the tangle. I wrote it relatively quickly because I want to present a short version of it in a french conference, but I will improve it and send it to peerreviewd conferences in the distributed computing community. Here is the first version of my paper: https://hal.archivesouvertes.fr/hal01716111v1
it contains two results :  an analysis of the average number of tips over time that corrects the mistakes I found in the white paper.  a proof that under the given model, proof of work is necessary to secure the tangle. i.e., To be secure, an attacker should not have a greater hashing power than what's currently *used* by honest nodes. The last point was mentionned many times on several forums but every time the answers were not clear whether this was necessary or not, so here's a proof. It can be important for the IOTA implementation as it means that something should be done before the Coo is removed. For instance: insensitive the honest nodes so that they always use fully their hashing power to issue new transaction (even empty ones).
Of course I'm open to discussion. Quentin.

By mthcl  7 Mar 2018
Hi Quentin, thanks a lot for your work! I did generally like it, and indeed the proofs that the attacker's hashing power should be less than the "active" hashing power of the honest nodes are valuable. Regarding the mistakes in my paper, there were indeed some in earlier versions, in particular, regarding the mean number of tips for the "uniform" tip selection algorithm; however, the later versions are correct, I think. Note that you (as well as Bartosz, ref. [6]) use the discretetime version of the model, while I'm considering a continuoustime one. It is quite natural then that the two models do have quantitative differences. Also, let me note that, in my paper, the assumption that the honest nodes have more (active) hashing power was meant to be a standing assumption throughout the paper; sorry for not making that sufficiently clear. And, I must confess that I don't understand completely your proof of Theorem 1. Aren't you substituting (implicitly) the number of new transactions by its expectation? At least note that 2\lambda need not be integer...
Again, thank you for your interest and your work, and hope to have many fruitful discussions with you in the future.

By bramas  7 Mar 2018
Hi Serguei, Thanks for your comments. You are right I missed the fact that in your analysis you used continuous time. I will mention it in my paper. In theorem 1, yes I substituted lambda(t) by its average lambda. You are right I should be more careful here as the case lambda not an integer is not clear and not well defined.
In your analysis, even though you use continuous time, there are some formula that I don't understand: "Now, think about a new transaction that comes at this moment; then, a transaction it chooses to approve is a tip with probability r/(r + ?h) (since there are around r tips known to the node that issued the transaction, and there are also around ?h transactions which are not tips anymore, although that node thinks they are), so the mean number of chosen tips is 2r/(r + ?h)."
The first probability should be r/(r + C(r,2?h)) I think since around ?h sites were issued since time th, with C(r,2?h) being the number of nonempty bins when throwing 2?h in r bins. Again, I do not think "the mean number of chosen tips" is simply two times this number as two tips confirms the same site. Maybe there is something I missed related to the continuous time setting again.
For the assumption about the honest nodes and their use of their hashing power, maybe you can add a precision in the white paper after the introduction of MCMC but my paper was not against that (its not written in the white paper that MCMC solves everything). My paper is just here to prove formally that this assumption is necessary for any tip selection algorithm. That is something that is quiet intuitive (and you said it in the white paper) but I wanted to tackle the problem with the most general approach. Actually the motivation was more from what I read on several forums about people saying that the PoW is not here to secure the protocol. Now I make it clear that with tip selection algorithm based on weight like MCMC, the PoW done by honest nodes is used to secure the protocol.
Thanks again for your comment and for your white paper, I think there are a lots of interesting questions about the tangle and I'm happy to work on it. I will have a Master thesis student working on it in few weeks.

By enis.olgac  7 Mar 2018
Thanks for the really good work. For me, it opened another sight. Especially the definition of “Conflicting Transactions”. In order to reinforce my thoughts further, I need answers to couple of questions, though: • Does a “sender address” mean it is, or can be mapped to a unique id within tangle? • Can one track all transactions of a “sender” within tangle? • Same questions for “receiver address”. • You say “Two sites are conflicting if they try to move the same funds to two different receivers, i.e. if both executing transactions results in a negative balance for the sender.” Is it correct that “same funds” does not mean exactly the same amount or identical transactions, except “receiver address”? • Are “Conflicting Transactions” and “Double Spending” more or less synonyms? It would be a great, if you can help. Thanks in advance.

By bramas  7 Mar 2018
Hi, You're right I wasn't very clear about what is a conflicting transaction but that is because the exact definition is not important in my paper. That could depend on the implementation. In my definition of conflicting transactions, you're right, I thought about something allowing double spending. Same funds again may depends on the implementation. a fund can be the output of a previous transaction, in this case two transactions are conflicting if they use as input the same previous output (If I'm not mistaken that's close to the current implementation) but one can imagine allowing the same previous output for the input of a transaction, in this case, two transactions are conflicting if the sum of their input is greater than the previous output.
I did not looked into the way addresses are used in the tangle. I tried to stay as far as possible as the implementation. I stayed away from everything used by currencies like the fact that public keys represent addresses and that private keys are used to sign transactions.

By mthcl  7 Mar 2018
+xIn your analysis, even though you use continuous time, there are some formula that I don't understand: "Now, think about a new transaction that comes at this moment; then, a transaction it chooses to approve is a tip with probability r/(r + ?h) (since there are around r tips known to the node that issued the transaction, and there are also around ?h transactions which are not tips anymore, although that node thinks they are), so the mean number of chosen tips is 2r/(r + ?h)." The first probability should be r/(r + C(r,2?h)) I think since around ?h sites were issued since time th, with C(r,2?h) being the number of nonempty bins when throwing 2?h in r bins. Again, I do not think "the mean number of chosen tips" is simply two times this number as two tips confirms the same site. Maybe there is something I missed related to the continuous time setting again. That argument in the WP is not rigorous (note that I didn't state it as theorem), and I think I was implicitly assuming that ?h is large (as it should be in practice). It's confirmed by (continuoustime model) simulations, though.
It could be a good task for a Master student, to write it down in a rigorous way

By bramas  8 Mar 2018
No actually you were 100% right. I did not understand the part "r/(r + ?h)" but it's true. In the continue time setting there's no "balls into bins" game like I did for the discretetime setting. In continuous time each newly issued site confirms 2r/(r + ?h) tips in average so the remaining is correct. So I was clearly confused by the continuous time setting. (I'm glad my analysis for the discrete time is still correct though ) I'll will update the mention to your work in my paper accordingly soon.
Thanks for your help.
I asked other questions on discord about the MCMC. Have you started working on a formal analysis of the number of tips with other tip selection like MCMC or x>x^a MCMC?

By mthcl  8 Mar 2018
+xHave you started working on a formal analysis of the number of tips with other tip selection like MCMC or x>x^a MCMC? It seems to be too difficult to prove anything formally for that MCMC (unless you really have simple [=unweighted] RW towards the tips, but even then it's tricky). There are plenty of simulations being done, though.

