
At the St. Petersburg casino there's a game: starting with a prize of 2\$, a coin is tossed repeatedly, and each time it comes up tails the prize money doubles. When it comes up heads you win the prize and the game ends.

The question is: how much money is it fair to spend to take part in such a game?

This seems like a paradox if we consider it “fair” to pay the expected value of the game. We can compute that value by multiplying each possible outcome by its probability of occurring:

$$ 2 \cdot \frac{1}{2} + 4 \cdot \frac{1}{4} + 8 \cdot \frac{1}{8} + ... $$

which is equal to evaluating the series

$$ \sum_{n=1}^{\infty} 2^n \cdot \frac{1}{2^n} $$

This is an infinite sum of 1s, so it would seem reasonable to spend any finite amount of money to take part in such a game, which many consider paradoxical given that the probability of winning back that same amount of money is very low.

Many proposed solutions involve using not the monetary value itself but a function of it, called a utility function. A first attempt was made by Daniel Bernoulli [1] using a logarithmic function of monetary value, the reasoning being that the more money we have, the less valuable added money becomes. With this function the series converges, but as Menger [2] points out, it is sufficient to offer a reward that grows in the order of $ 2^{2^n} $ for the paradox to reappear.
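A quick check of both claims (my computation, with logarithms in base 2 for convenience; Bernoulli's argument works with any logarithm):

$$ \sum_{n=1}^{\infty} \frac{1}{2^n} \log_2 (2^n) = \sum_{n=1}^{\infty} \frac{n}{2^n} = 2, \qquad \text{while} \qquad \sum_{n=1}^{\infty} \frac{1}{2^n} \log_2 \left( 2^{2^n} \right) = \sum_{n=1}^{\infty} \frac{2^n}{2^n} = \infty $$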

Other solutions [3] aren't really satisfactory to me: as long as we keep the constraint that we use the expected value and consider all possible outcomes, there exists a game of coin tossing for which it is reasonable to spend all my money as an entry fee.

Let's consider the following problem instead, which I call the Crazy Russian Merchant problem.

At the St. Petersburg market there's a crazy merchant who, due to a global shortage, is the only one selling wheat. To set the price he tosses a coin until it comes up heads, and the wheat price is given by $2^n$ where $n$ is the number of times it came up tails plus one. How much money is it rational to bring in order to be able to buy wheat?

For this problem to be somewhat realistic we assume that we are broke, so all money must be loaned. We can say at least two things:

  • if I have to buy wheat once, and I want it with 100% certainty, that's not possible because of the random nature of the coin tosses;
  • if I have to buy wheat multiple times, I can specify a percentage of times I can tolerate not buying wheat.

In this perspective it's useful to think of probabilities as embedded in the time domain rather than in possible universes, as [4] points out.

Our problem can be modeled as a random variable $Y = 2^X$ where $X$ is a geometrically distributed random variable with parameter $p = 0.5$.

That is, if I require only a 50% success rate, it's fine to loan just 2\$: that's because $ P(Y \leq 2) = P(X \leq 1) = 0.5 $. So if I buy multiple times with 2\$ I may be able to get wheat half the time. If I require wheat 75% of the time, then I should bring 4\$, because $ P(Y \leq 4) = P(X \leq 2) = 0.75 $.
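In general $ P(Y \leq 2^k) = P(X \leq k) = 1 - 2^{-k} $. As a sanity check, here's a minimal simulation sketch in plain JavaScript (the function names are mine):

// one visit to the merchant: price is 2^n, with n = number of tails plus one
function merchantPrice() {
    let n = 1;
    while (Math.random() < 0.5) n++; // each tails doubles the price
    return 2 ** n;
}

// fraction of visits in which `budget` is enough to buy wheat
function successRate(budget, trials = 1000000) {
    let wins = 0;
    for (let i = 0; i < trials; i++) {
        if (merchantPrice() <= budget) wins++;
    }
    return wins / trials;
}

console.log(successRate(2)); // ≈ 0.5
console.log(successRate(4)); // ≈ 0.75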

If we don't have a success-rate requirement over time, though, we have no rational reason to bring more than 2\$, that is, the minimum we need to be able to buy wheat with some probability $p > 0$. We will be fine being successful 50% of the time. If we bring less we won't ever “win” at this game, because $ P(Y \leq 1) = P(X \leq 0) = 0 $.

Back to the original game, where you win instead of losing: you can't win every time, so what win ratio is rational? This seems to depend entirely on the individual, but assuming, as before, that I don't have a success-rate requirement over time, 2\$ should be the fee.

In both examples it's probably 2\$ + $\epsilon$: in the wheat problem we assumed there's added value in having wheat (either eating it or selling it for a profit; storing it would be meaningless), and in the original game you always win something, so the casino would lose money whatever the results.

I don't pretend to have found a mathematically important insight into this problem. Rather, reasoning about it has been fun and has helped me learn new things.

Fortunately casinos have already set their fees.

[1] https://www.jstor.org/stable/1909829

[2] Menger (1934), in the English translation by Wolfgang Schoellkopf

[3] https://plato.stanford.edu/archives/sum2022/entries/paradox-stpetersburg/

[4] https://doi.org/10.1098/rsta.2011.0065

The Hurst exponent can be used to determine the fractal dimension of a time series, that is, whether a series of data over time exhibits some kind of fractal behavior. It measures how the series correlates with itself over time, and there are three possible outcomes:

  • 0.5: the data is random;
  • between 0 and 0.5: the data is negatively correlated;
  • between 0.5 and 1: the data is positively correlated.
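To make this concrete, here's a minimal sketch in plain JavaScript of the classic rescaled range (R/S) estimator, the method Hurst himself used. The function name and the doubling window sizes are my choices, and real implementations differ in the details:

// estimate the Hurst exponent of `series` via rescaled range (R/S) analysis;
// assumes a reasonably long series (at least a few dozen points)
function hurstRS(series) {
    const n = series.length;
    const logs = []; // pairs of [log(size), log(mean R/S)]
    for (let size = 8; size <= n / 2; size *= 2) {
        const chunks = Math.floor(n / size);
        let rsSum = 0;
        for (let c = 0; c < chunks; c++) {
            const chunk = series.slice(c * size, (c + 1) * size);
            const mean = chunk.reduce((a, b) => a + b, 0) / size;
            // range of cumulative deviations from the mean, and standard deviation
            let cum = 0, min = 0, max = 0, sq = 0;
            for (const x of chunk) {
                cum += x - mean;
                if (cum < min) min = cum;
                if (cum > max) max = cum;
                sq += (x - mean) ** 2;
            }
            const std = Math.sqrt(sq / size);
            if (std > 0) rsSum += (max - min) / std;
        }
        logs.push([Math.log(size), Math.log(rsSum / chunks)]);
    }
    // H is the least-squares slope of log(R/S) against log(size)
    const m = logs.length;
    const mx = logs.reduce((a, p) => a + p[0], 0) / m;
    const my = logs.reduce((a, p) => a + p[1], 0) / m;
    const num = logs.reduce((a, p) => a + (p[0] - mx) * (p[1] - my), 0);
    const den = logs.reduce((a, p) => a + (p[0] - mx) ** 2, 0);
    return num / den;
}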

If you've thought about applying it to Bitcoin and friends, you've probably stumbled upon this article from 2018, which tries to replicate another paper's findings about market efficiency in four major (at the time) cryptocurrencies.

Given that four years have passed, I've decided to try to replicate the results with more data, starting from the same code as the article (since no code is available in the paper). The conclusions in 2018 were mainly two:

  • the three methods used for approximating the Hurst exponent (DSOD, R/S, DMA) don't consistently yield the same result, so they're somewhat unreliable;
  • market efficiency in cryptocurrencies couldn't be asserted based on the kind of analysis done in the paper (which only focused on the R/S method for approximating the Hurst exponent).

The code needed a slight modification because the API used to retrieve prices limits its results to 2000 data points per call, so two calls are needed to get the full 2013-2022 range of price data. You can find it on GitHub.
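The gist of the fix looks something like this sketch; the endpoint, parameter names, and response shape here are hypothetical, not the actual API's:

// hypothetical endpoint: returns up to 2000 daily points ending at `toTs` (Unix seconds)
async function fetchDailyPrices(symbol, toTs) {
    const res = await fetch(`https://api.example.com/histoday?sym=${symbol}&limit=2000&toTs=${toTs}`);
    return (await res.json()).data; // assumed shape: [{ time, close }, ...]
}

// 2013-2022 is over 3000 daily points, so stitch two calls together
async function fetchAllPrices(symbol) {
    const recent = await fetchDailyPrices(symbol, Math.floor(Date.now() / 1000));
    const older = await fetchDailyPrices(symbol, recent[0].time - 86400);
    return older.concat(recent);
}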

Let's first plot BTC, ETH, LTC, and DASH prices up to the present day (16/05/2022).

Plot of prices up to 2022

Looking at it, it would seem that at least BTC and ETH follow some kind of fractal pattern. But let's try to approximate the Hurst exponent with the three methods used in the article, first on all the available data.

Table of Hurst exponents on all data up to 2022

Now the old data, for comparison:

Table of Hurst exponents from the article

The comparison isn't clear-cut: for instance, while for BTC the three methods now show lower variability, for ETH the spread between estimates has increased.

Let's go on and plot the Hurst exponent calculated over the previous 300 days at intervals of 50 days:

Plot of rolling Hurst exponent

While the DSOD and DMA methods exhibit a pretty random behavior, the R/S method (the one also used originally by Hurst) seems a bit more consistent, now that we have more data points, especially for BTC.
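Concretely, the rolling computation looks something like this sketch, reusing the hurstRS function from above (the function and parameter names are mine):

// Hurst exponent over a 300-point window, recomputed every 50 points
function rollingHurst(series, window = 300, step = 50) {
    const result = [];
    for (let end = window; end <= series.length; end += step) {
        result.push({ end, h: hurstRS(series.slice(end - window, end)) });
    }
    return result;
}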

If we stick to R/S analysis, the Hurst exponent should be somewhere between 0.5 and 0.7 for BTC and ETH, and between 0.4 and 0.7 for DASH and LTC.

Can we conclude that BTC and ETH are somewhat fractal, up to now? I can't really be sure; we may need to wait for the next big swing up or down. Or for many years of random prices.

For the sake of it, let's do the same calculations over the four major cryptocurrencies excluding BTC and ETH (by market capitalization, at the time of writing). These are BNB, ADA, XRP and SOL.

4 major altcoins rolling Hurst

The results are pretty much the same.

Now let me challenge the methodology a bit: the parameters for computing the Hurst exponent seem to have been chosen rather arbitrarily, not only in the web article (which was just trying to replicate the original paper's results) but also in the paper itself, which in turn cites this article, where it's clearly stated that the 300-day rolling window was chosen after some trial and error because other windows exhibited too much variability in the results. Furthermore, that paper analyzed stock prices during a crisis, so a completely different setting (we're analyzing cryptocurrencies without any crisis, except for that period in 2020/21).

This of course isn't enough to overturn the paper's conclusions: given the data and the methods used, we can neither confirm nor deny that cryptocurrency prices are fractal, nor prove or refute the efficient market hypothesis.

Let's try, for instance, with a rolling window of 600 days:

Plot of rolling Hurst exponent over 600 days

The results with the R/S method are better; not so much with the others. This makes me question either the other methods or the original R/S analysis. Should we go and invest everything in crypto because it will always go up? Should we dump every crypto we hold and start hating on crypto? As often happens with science, we end up a bit more skeptical of both positions and of this kind of analysis.

The following is a collection of best practices I came up with on my own while dealing with Web Components in plain JavaScript.

Full disclosure: I am becoming more and more intrigued by The Stackless Way.

1. Fetch with ids

Retrieve children by their id, and only this way (the id will have the same name as the component you want to access, because of the Same identifier principle).
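For instance (assuming a child with id child-component, per the Same identifier principle):

// inside a component method, after the shadow DOM has been attached
const child = this.shadowRoot.getElementById('child-component');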

2. One component, one template

Define a new template for each and every component you're using, even if not needed right now. You will probably need it later.

3. Same identifier

Use the same identifier for the following things:

  • js file name (dashed)
  • component class name (camel case)
  • id of the custom element in the DOM (dashed)
  • id of the template of a custom element (dashed)
  • name of the custom element (dashed)

For instance, MyCoolComponent is in a file named my-cool-component.js, with a custom element called my-cool-component, with the attribute id="my-cool-component", and with a template with the attribute id="my-cool-component".

TODO for lists and stuff like that.

4. Communicate with attributes

Pass data to children by setting attributes on them, and let them react with attributeChangedCallback.
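A minimal sketch of the pattern (the amount attribute is made up for illustration):

// the child declares which attributes it observes, and reacts to changes
class ChildComponent extends HTMLElement {
    static get observedAttributes() { return ['amount']; }
    attributeChangedCallback(name, oldValue, newValue) {
        if (name === 'amount') {
            // update the shadow DOM with the new value here
        }
    }
}
customElements.define('child-component', ChildComponent);

// the parent passes data down by setting the attribute
this.shadowRoot.getElementById('child-component').setAttribute('amount', '42');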

TODO for child -> parent communication.

5. One way nesting (the only way that works)

Each Component should have nested components only inside its own shadow DOM. To do so, it must respect the One component, one template principle and then do the following:

    <!-- template of parent component (class ParentComponent) -->
    <template id="parent-component">
        <!-- other template tags if needed -->
        <child-component id="child-component">
            <!-- slot tags if needed -->
        </child-component>
        <!-- other template tags if needed -->
    </template>

    <!-- template of child component (class ChildComponent) -->
    <template id="child-component">
        <!-- other tags if needed, or other components -->
    </template>

This requires a few lines of boilerplate code in each component's constructor; for instance, in the ParentComponent of the example above:

// attach a shadow root and clone this component's own template into it
const shadowRoot = this.attachShadow({mode: 'open'});
const template = document.getElementById('parent-component');
const templateContent = template.content;
shadowRoot.appendChild(templateContent.cloneNode(true));

The approach outlined above is also the only way I could get nesting to work. Notice how, when inspecting the rendered DOM, each component has no children outside the shadow DOM, with the exception of tags with the slot attribute (which aren't really there anyway).

6. Style outside

Yes, I know that

const style = shadowRoot.appendChild(document.createElement("style"));
style.textContent = "@import url( '/components/bankster.css' );";

is ugly, but at least this way we can keep components smaller and easier to understand.

Conclusions

While I still haven't used pure JS Web Components in a big production project, it's nice to explore the possibility of not dealing with a framework. For small projects this will be my go-to for the near future; let's see how this technology pans out.

Today's the day I finally finished bandit at OverTheWire.

If you've used bash a little, most challenges aren't hard, but some require you to think outside of the box.

Don't worry if you're not a programmer or don't know Linux very well: my girlfriend is currently doing the challenges and enjoying them, and she had barely used the terminal before...

Anyway, I have a few takeaways from this challenge:

  • it's fun
  • you learn something along the way
  • it's much like chess

It's fun

It really is. Sometimes you get stuck and want to never do bandit ever again, but you return after a few days if, like me, you're the competitive kind in games. And this is a competition mainly with yourself!

You learn something along the way

And you learn more from wrong attempts than from the actual solution. I ended up reading a ton of man pages when sometimes the solution was way simpler than I imagined, and I certainly picked up new knowledge about Linux.

It's much like chess

Well, at least to me: in a chess game I usually hunt for weaknesses in my opponent's position or way of thinking (this is risky... sometimes I lose because of this style of play). In bandit, you hunt for that specific “line” that grants you access to the next level's password. Checkmate.

This post comes after I was asked to implement a new piece of functionality, pretty easy by itself, but which had to be built inside a mess of objects, interfaces, and the like. It was impossible to do within the current structure, so the choice was between making the nth ugly implementation to make things work, or tearing everything apart, refactoring a good portion of the application, doing a lot of manual tests, and coordinating with the other two teams who were using the framework my team was developing. I chose the latter.

If you're a developer in the Java world, you probably know about POJOs: Plain Old Java Objects. Originally proposed by Fowler, this kind of pattern (if it can be called that) is characterised by a single class with multiple properties; and since we're in Java, these properties are private and each has its own getter and setter. Nothing more, nothing less. You probably have something similar in your favourite language of choice.

So far so good: everyone I know uses POJOs somewhere when developing a new application or functionality.

But here comes the bad part. Imagine, for a moment, that you work for a company that's often rushing to meet deadlines. Imagine you have a very complex legacy system, which has recently been hammered here and there to fit into a new and flashy REST API with Swagger and all the other bells and whistles. Maybe imagine it has to integrate various external services, and it's comprised of a back-end component and a front-end component. Optionally, also imagine that said hypothetical company hires people who aren't exactly Java wizards, and that just a few people in your team (or even your entire organisation) are good developers. Such a company clearly doesn't exist, since it wouldn't survive many years in business, but for the sake of my argument let's imagine it does.

This company would do business in a particular domain, so it makes sense that different subsystems may handle very similar data. Often even the same data, but of course with different models. Everyone uses POJOs: say we're in finance, you may have “percent” and “percentage” inside “Fund” objects and “CoolFund” objects. Same thing, different name. “We have to create a new object, because we need the veryCoolProperty which your object lacks”.

Days pass. Project managers want features faster, while bugs caused by previous haste are being fixed at the same time. Someone, sometime, somewhere has the brilliant idea, to speed things up, of making POJOs smarter and more complex, able to do all kinds of things: proxies are built to conform one subsystem's POJO structure to that of another, various utility classes are made to map them back and forth, and clever lambdas are used everywhere to make the code more concise and do things in few and obscure lines.

Suddenly you have POJO methods leveraged into interfaces of getters and setters (if you're lucky: otherwise someone threw a Builder pattern here and there and you have setters only, deep under layers of software). Suddenly you have POJOs extending each other, and no longer a clear representation of what your data is, how it gets populated, and how it's used.

While it's pretty obvious these are all anti-patterns, I've seen these things happen, even done by developers better than me, and even by me. Why? Because we're human, I think. When pressure from clients and management increases, your thoughts about patterns and anti-patterns start to dissipate, leaving room only for delivering that little piece of functionality you have to do, meeting the two-day (or two-hour) deadline.

How to prevent all this, given you can't change yourself, your colleagues or your company?

My takeaway: don't ever, ever, ever make POJOs complex. I'll be even more radical: don't use getters and setters, and make fields public. If you have two different objects that differ only by a property, use the same object with the good old null value. Rule of thumb: only split POJOs if the number of unused fields in a subsystem is bigger than the number of used fields. But forget this rule if the number of used fields is > 5 or they are important fields.

This way extensions and interfaces make even less sense, and people will either just use your POJO or make their own, and everyone will get on with their lives, drinking beer happily on the weekend.

Back to my massive refactor: in the end I kept getters and setters; I don't think the Java world is ready for the heresy of public fields (wink wink, Python).