Skip to main content

an analysis of Mosers worm

·10545 words·50 mins
Jules Johnson
Author
Jules Johnson
This article is based off a previous writeup on the subject. To read that article, click here.

Introduction
#

Moser’s Worm is an important unsolved problem in geometry with a very amusing name. The name comes from an odd hypothetical. Suppose you have an inch-long pet worm, who’s a very restless sleeper. When the worm goes to bed, there’s no way of knowing how it might be twisted or curved. Nonetheless, you want to cover your worm with a little blanket. What’s the smallest blanket you could use, regardless of how the worm is positioned?

Although many unsolved problems in mathematics are incomprehensible to non-specialists, the standard statement of Moser’s worm only requires requires one definition:

A set of points (in euclidean 2D space) \(A\) accommodates a set \(B\) when there is a set \(B’\) such that \(B’\subseteq A\), and \(B’\) is congruent to \(B\).

Equivalently, a set of points \(A\) accommodates a set \(B\) when there is a set \(A’\) such that \(B\subseteq A’\), and \(A’\) is congruent to \(A\).

For these notes, \(A\) accommodates \(B\) will sometimes be notated with \(A \sqsupseteq B\). If \(B\) does not accommodate \(A\), then we write \(A \sqsupset B\)

From here on out, I might be a little bit looser. I’ll be certain to eliminate ambiguity whenever necesary, but the point of these notes is mostly to ensure that these concepts are written down somewhere they’re easy to locate later on, without necessarily being up to the hyper-rigorous standards of mathematical publication. This is my website, after all. Without further ado, here’s the statement of the problem:

Question 1
What is the smallest convex shape that can accommodate any curve of length 1?

It’s a shockingly simple question for something that has remained unsolved for so long! Although my background is in mathematics, and I studied differential geometry in my undergrad, this question in particular is one I don’t know a lot about. It’s a bit refreshing to work on something unrelated to what you normally do, so I’m looking forward to it! I should mention that I’m deliberately not looking into the already existing body of work on this problem. This is partly for my own satisfaction of discovering things for myself, and partly to avoid bias. Although I don’t suppose that I could fully crack an infamous problem outside my own specialty, it can still be very invigorating for a field to get a fresh pair of eyes and a fresh perspective.

There’s also the second benefit of writing an article on a subject from an outsider’s perspective, which is that it should be very approachable for the reader. Because I don’t know any of the jargon, techniques, or theorems in this field, I can’t possibly confuse a reader by using them. That said, I will use some basic calculus at a couple points.

Type 1 and Type 2 Curves
#

My goal here is to see what curves we don’t need to worry about. There’s two prongs to this:

  1. Show that curve \(A\) is always accommodated when curve \(B\) is accommodated.
  2. Show that when curves \(A\) and \(B\) are accommodated, curve \(C\) is also accommodated.

To that end, we give these definitions:

Given a shape \(a\), the convex hull of \(a\) will be noted \([a]\).

The length of a curve \(X\) will be noted \(\ell X\). The section of an open curve \(X\) between two points \(a\) and \(b\) will be noted with \(X_a^b\), so that the length of an open curve between between two points \(a\) and \(b\) will be noted with \(\ell X_a^b\).

For any set of points \(p_1,p_2\dots\), the polygonal chain that connects each of them in order will be denoted \(\langle p_1, p_2\dots\rangle\).

This entire problem exists in euclidean space. If \(d\) is the euclidean distance metric, and \(a\) and \(b\) are points, then \(\ell\langle a,b\rangle\) will be used as a synonym for \(d(a,b)\)

A unit-length curve \(X\) is taut if there does not exist another unit-length curve \(X’\) such that \([X’] \sqsupset [X]\).

Immediately there are some lemmas we can form regarding taut curves without any further context. For example, this is the lemma that inspired the choice of the word “taut”:

Lemma 1
If a curve experiences any non-zero curvature at a point that is not on the border of its convex hull, it is not taut.

proof:
Let \(X\) be a unit-length curve, and let \(x\) be a contiguous section of \(X\) that does not touch the border of \([X]\), except at two endpoints, \(a\) and \(b\). Then let \(X’\) be identical to \(X\), except that \(x\) is replaced with a straight line from \(a\) to \(b\). The length of \(X’\) is necessarily less than 1, although \(X’\) has the same convex hull. Finally, let \(X’’\) be the unit-length curve that includes all of \(X’\), but with an added length at the end. \(X’’\) must have a convex hull that includes everything in \([X]\) and more.

\(\square\)

Hopefully, this drawing explains why I chose the word "taut" to describe this property. The curve on the left is "taut", in the sense that it could be stretched to coincide with the border of its convex hull, and that extra length could be used to create a larger hull. The light grey area is the same between the two shapes, but the dark grey area is added by using length more efficiently

Here’s another brief and straightforward observation about taut curves.

No self intersecting curves are taut.

Let’s start our effort to characterize taut curves with yet another definition

A Type 1 curve is a curve for which all points on the curve lie on the boundary of its convex hull.

A Type 2 curve is a curve for which there is one single segment that crosses the inside of the convex hull.

In general, a Type n curve is one with \(n\) distinct, discontinuous segments along the boundary of its convex hull, and \(n-1\) distinct segments passing through the interior of the convex hull

An example of a Type 1 and Type 2 curve.

Type 1 and Type 2 curves can also be thought of in terms of path direction. A path along a Type 1 curve moves entirely either clockwise or counterclockwise across the edge of its convex hull. A path along a Type 2 curve starts moving either clockwise or counterclockwise along the border, but then moves in the opposite direction for a length.

Here’s a brief lemma about Type \(n\) curves that will be useful in several results coming up

Lemma 2
Let \(X\) be a taut curve, and let \(a\) and \(b\) be points on \(X\) such that \(X_a^b\) consists only of internal points of \([X]\). \(X_a^b\) has zero curvature.

proof:
Suppose, for the sake of contradiction, that \(X_a^b\) does not have zero curvature. Then \(\ell X_a^b>\ell\langle a,b\rangle\). Let \(X’\) be formed by replacing \(X_a^b\) in \(X\) with \(\langle a,b\rangle\). It is clear that \(\ell X’ < X\), and so \(X\) is accommodated by \(X’\), and is not taut.

\(\square\)

I’ve only talked about Type 1 and Type 2 curves so far. Why hasn’t any special attention been given to curves of other types? As it turns out, these curves are never taut.

Theorem 1
There are no taut Type \(n\) curves, for \(n \geq 3\).

proof:
To force a contradiction, let \(X\) be such a curve. Let \(a\), \(b\), \(c\), \(d\), \(e\), \(f\), and \(g\) be points on \(X\), appearing in that order, with the added requirements that:

  1. \(a\), \(b\), \(c\), \(d\), \(e\), \(f\), and \(g\) are all on the boundary of \([X]\).
  2. \(a\) and \(g\) are the endpoints of \(X\).
  3. \(d\) is the point furthest from \(\langle a,g \rangle\), on the opposite side from \(b\) and \(f\)
  4. \(\langle b,c \rangle\) is internal to the convex hull of \(X\).
  5. \(\langle e,f \rangle\) is internal to the convex hull of \(X\).

Here’s a drawing to explain.

For ease of upcoming calculations, let \(\langle a,g\rangle\) lie on the \(x\) axis, with \(a\) existing at \(0,0\).

Let \(\phi\) be the minimal distance from \(d\) to \(\langle a,g \rangle\). It immediately follows that:

$$ \frac{\ell X_b^f}{2}\leq\phi $$

In this case, define \(a’\) and \(g’\) to be these two points:

$$ a’=\left( 0, -\frac{\ell X_b^f}{2}\right) $$

$$ g’=\left( \ell\langle a,g\rangle, -\frac{\ell X_b^f}{2}\right) $$

Then we can define \(X’\) to be the curve

$$ X’ = \langle a’,a\rangle \cup X_a^b \cup \langle b,f\rangle \cup X_f^g \cup \langle g,g’\rangle $$

This is given in the drawing below:

Because \(a’\) and \(g’\) are both below \(d\), the entirety of \(X\) must be inside \(X’\).

\(\square\)

Let’s limit a bit further which curves can be Type 1.

Lemma 3
Let \(X\) be a Type 1 taut curve with endpoints \(a\) and \(b\). Every point on \(X\) lies on a line perpendicular to \(\langle a, b\rangle\).

proof:
Let \(X\) be a curve as described above, and let \(c\) be the point on \(X\) that lies furthest from any line perpendicular to \(\langle a, b\rangle\). without loss of generality, assume \(c\) is closer to \(a\) than it is to \(b\).

Let \(\alpha\) be the line that includes the line segment \(\langle a, b\rangle\), and let \(c’\) be the point on \(\alpha\) closest to \(c\). Let \(X’\) be a curve identical to \(X\), except that the section between \(c\) and \(a\) is replaced with a straight line between \(c\) and \(c’\).

This situation is illustrated below:

It is clear that

$$ \ell\langle c,c’\rangle < \ell C_c^a $$

and therefore

$$ \ell X’<\ell X $$

Therefore, \(X’\) has a length less than \(X\), and so is a unit curve. The convex hull of \(X’\) can accommodate \(X\), but is larger. By definition, \(X\) is not taut.

\(\square\)

Finally, these past few lemmas and a few other ideas are compiled in a theorem that helps to characterize Type 1 curves:

Theorem 2
A curve \(X\) with endpoints \(a\) and \(b\) is not accommodated by a distinct Type 1 curve if and only if:

  1. All points on \(X\) are on its convex hull.
  2. Every point on \(X\) lies on a line perpendicular to a point on \(\langle a,b \rangle\).
  3. There does not exist a set of points \(\lbrace c, d, e, f\rbrace\) such that:
    a. \(\langle d,e\rangle\) is tangent to \(X\).
    b. \(f\) is one of the end points of \(X\).
    c. \(e\) is a point on \(X\) that is not an end point.
    d. \(\langle c,d\rangle\) is perpendicular to \(\langle f,c\rangle\) and \(\langle d,e\rangle\).
    e. \(\ell \langle a,b\rangle + \ell \langle c,f\rangle + \ell\langle e,d\rangle \leq \ell X_f^e\).

proof:
The fist of these requirements is implied by the definition of a Type 1 curve, and the second is implied by Lemma 3. Therefore, this proof focuses on the third requirement

First, suppose that a set of points as described exists. Without loss of generality, let \(f=a\). As before, I’ll include a diagram to clarify what the situation described in the lemma statement is.

Let \(X’ = \langle c,a,b\rangle \cup X_b^e \cup \langle e,d\rangle\). As evident from the diagram, the convex hull of \(X’\) accommodates the convex hull of \(X\). Next, consider the inequality assumed earlier:

$$\begin{align*} \ell \langle a,b\rangle + \ell \langle c,a\rangle + \ell\langle e,d\rangle &\leq \ell X_a^e \\ \ell \langle c,a,b\rangle + \ell\langle e,d\rangle &\leq \ell X_a^e \\ \ell \langle c,a,b\rangle + \ell X_b^e + \ell\langle e,d\rangle &\leq \ell X_a^e + \ell X_b^e \\ \ell X’ &\leq \ell X \end{align*}$$

Therefore, \([X’]\) both accommodates \([X]\) and includes other area as well, but \(X’\) is shorter than \(X\). By definition, \(X\) is not taught.

An “if and only if” proof requires that the statement is proven in both directions, so Theorem 2 will now be proved in the opposite direction. Suppose that such a set of points as \({c, d, e, f}\) does not exist, but that \(X_1\) is a Type 1 curve such that \([X_1] \sqsupset X\). Let \(c\) and \(d\) be the endpoints of \(X_1\). An example of what this might look like is shown below:

Next, let \(\alpha\) be the line passing through \(c\) and \(d\). let \(X_2\) be the shortest curve with endpoints on \(\alpha\) that accommodates \(X\). \(X_2\) must consist of two lines parallel to \(\alpha\), as well as a section of the boundary of \([X]\), which may or may not include \(\langle a,b\rangle\). Let \(c’\) and \(d’\) be the endpoints of \(X_2\), and let \(e\) and \(f\) be the points where \(X_2\) meets the convex hull of \(X\), as shown below:

Finally, the length of \(X_2\) can be reduced even further. Let \(d’’\) and \(c’’\) be points on \(\langle d’,e\rangle\) and \(\langle c’,e \rangle\) that are both a distance \(\phi\) away from \(d’\) and \(c’\), with \(\phi\) having the largest possible value without \(\langle d’’,c’’\rangle\) intersecting the interior of \([X]\).

Let \(X_3\) be \(X_2\), without the lines from \(d’\) to \(d’’\) and from \(c’\) to \(c’’\). If \(f\) and \(e\) are both endpoints of \(X\), then \(X_3\) must contain the entirety of \(X\). This is impossible, as \(\ell X = 1\), so either \(e\) or \(f\) is not an endpoint of \(X\).

Instead, assume (for the sake of contradiction), that neither \(f\) nor \(e\) are endpoints of \(X\). This would require that at least one of the two points is not on a line parallel to \(\langle a,b \rangle\).

Therefore, \(\lbrace c’’, f, e, d’’\rbrace\) satisfies the requirements of the lemma.

\(\square\)

The reason I set out this lemma is because I want to demonstrate that there is a limit on the “height” of Type 1 curves. If \(\langle a,b \rangle\) is very short, and \(X\) has a point very distant from \(\langle a,b \rangle\), it would be easy to find a way to maximize \(\ell X_a^e\) and minimize \(\ell \langle a,c \rangle\)

This theorem serves to characterize type 1 curves fairly precisely. One of the broader goal of this project is to characterize Type 2 curves equally as well.

Lemma 4
Let \(X\) be a taught Type 2 curve, and let \(a\) and \(d\) be the end points of \(X\). Let \(b\) and \(c\) be the endpoints of the section of \(X\) which are internal to the convex hull of \(X\). It it required that:

$$ \ell \langle b,c\rangle \leq \ell\langle a,c\rangle $$ $$ \ell \langle b,c\rangle \leq \ell\langle b,d\rangle $$

proof:
The locations of \(a\) though \(d\) can be clarified by this diagram:

Suppose for the sake of contradiction that \(\ell\langle a.c\rangle > \ell\langle b,c\rangle\). Define \(X’\) such that

$$ X’ = X_b^a \cup \langle a,c\rangle \cup X_c^d $$

Because \(\ell\langle a,c\rangle > \ell\langle b,c\rangle\), \(\ell X’ < \ell X = 1\). Then let \(X’’\) be identical to \(X’\), except that an extra bit is added to increase the area of the convex hull while keeping \(\ell X’’\leq 1\). By definition, \(X\) is not taut.

By symmetry, \(X\) is also not taut if \(\ell \langle b.d\rangle > \ell \langle b,c\rangle\).

\(\square\)

The non-internal sections of a Type 2 curve behave something like Type 1 curves themselves. For example, consider the following Lemma:

Lemma 5
Let \(X\) be a taut Type 2 curve with endpoints \(a\) and \(d\), and let \(b\) and \(c\) be the endpoints of the section of \(X\) that is internal to \([X]\). Let \(L_1\) be the line passing through \(a\) and \(c\), and let \(L_2\) be the line perpendicular to \(\langle b,d\rangle\) that passes through \(d\). Then any point \(p\) on \(X_c^d\) must lie in the triangle bound by \(L_1\), \(L_2\), and \(\langle c,d\rangle\).

proof:
First, for clarity, a diagram of these lines is shown:

First, if \(p\) is a point on \(X_c^d\) that is on the “wrong” side of \(L_1\), then \(c\) is an internal point of \([X]\). This is not true by assumption.

The point \(p\) must be on the exterior of \([X]\), by assumption, and so cannot be on the wrong side of \(\langle c,d\rangle\).

Finally, suppose for the sake of contradiction that \(p\) lies on the wrong side of \(L_2\). Without loss of generality, let \(p\) specifically be the point furthest from \(L_2\). Then let \(p’\) be the point on the line containing \(b\) and \(d\) that is closest to \(p\). If \(X’ = X_a^p \bigcup \langle p, p’\rangle\), then \(X’\) is shorter than \(X\). Because \(\langle p,p’\rangle\) is perpendicular to \(\langle p’,d\rangle\), \(p’\) must also be further from \(L_2\) than any other point in \(X\). Therefore, \(X’\) accommodates \(X\), giving a contradiction.

\(\square\)

This lemma comes with a few charming corollaries:

Let \(X\) be a Type 2 curve with endpoints \(a\) and \(d\), and let \(b\) and \(c\) be the endpoints of the section of \(X\) that is internal to \([X]\). Let \(L_2\) be the line perpendicular to \(\langle b,d\rangle\) that passes through \(d\). If \(p\) and \(q\) are points on \(X_c^d\) such that \(\ell X_q^d < \ell X_p^d\), then \(q\) is not further from \(L_2\) than \(p\) is.
Let \(X\) be a Type 2 curve with endpoints \(a\) and \(d\), and let \(b\) and \(c\) be the endpoints of the section of \(X\) that is internal to \([X]\). The angles formed by \(\langle c,b,d\rangle\) and \(\langle d,a,c\rangle\) are acute.

Finally, we can make a little theorem effectively characterizing taut Type 2 curves.

Theorem 3
Let \(X\) be a Type 2 curve, and let \(a\) and \(d\) be the end points of \(X\). Let \(b\) and \(c\) be the endpoints of the section of \(X\) which are internal to the convex hull of \(X\). Let \(a’\) be the point that is co-linear to \(a\) and \(c\), such that \(\langle a, a’, b \rangle\) is a right triangle. Similarly, let \(d’\) be the point that is co-linear to \(b\) and \(d\), such that \(\langle c, d’, d \rangle\) is a right triangle. This is shown in the illustration below.

If \(X\) is taught, then:

  1. \(\ell \langle b,c\rangle \leq \ell\langle a,c\rangle\)
  2. \(\ell \langle b,c\rangle \leq \ell\langle b,d\rangle\)
  3. Every point on \(X\) between \(c\) and \(d\) lies in the triangle bounded by \(L_1\), \(L_2\), and \(\langle c,d\rangle\).
  4. Every point on \(X\) between \(a\) and \(b\) lies in the triangle bounded by \(L_3\), \(L_4\), and \(\langle a,b\rangle\).
  5. If \(p\) and \(q\) are points on \(X\) between \(c\) and \(d\), with \(q\) closer to \(d\), then \(q\) is closer to \(L_2\).
  6. If \(p\) and \(q\) are points on \(X\) between \(b\) and \(a\), with \(q\) closer to \(a\), then \(q\) is closer to \(L_4\).

An observation on convex hulls of points and circles
#

This section is likely common knowledge that goes without saying for experts in the field. However, I am not an expert, and I feel that finding a citation for this would be harder than simply proving it myself.

Let \(M\) be a circle with radius \(r\) centered at the origin, and let \(P\) be the point located at \((p,0)\). Every line tangent to \(M\) is perpendicular to a radius of \(M\). Let \(P’\) be a point on \(M\) where the tangent line passes through \(P\). Therefore, a right triangle is formed between the origin, \(P\), and \(P’\).

By the pythagorean theorem, \(\ell\left\langle P,P’\right\rangle=\sqrt{p^2-r^2}\), and the angle formed at \(P\) is \(\arcsin\left(\frac{r}{p}\right)\).

Ultimately, this gives the following lemma.

Lemma 6
The convex hull of a circle of radius \(r\) and a point located at distance \(p\) from that circles center includes two lines of length \(\sqrt{p^2-r^2}\) at an angle of \(2\arcsin\left(\frac{r}{p}\right)\) apart with their vertex at the point, tangent to the circle.

Defining Several Important Shapes
#

Now that the framework for which curves are worth considering has been well established, let’s build up a vocabulary of specific curves to consider. First, what exactly do I mean by “curve”? In a context where curves can be translated and rotated, it makes sense to define a curve more broadly than usual.

When I refer to a curve, I actually mean the congruence class to a specific curve. The term “curve” refers only to a particular shape, and not a position.

When I wish to refer to a curve in a particular position, I’ll mention an instance of a curve. If an instance of a curve is translated or rotated it becomes a different instance of that curve.

As a useful convention, curves will be named in capital letters, while their instances are named in lowercase letters.

Which curves are we going to give names to? In some sense, the simplest unit-length curve is the single line segment. I don’t want to use the letter L or I for this purpose, so I’ll call this \(G\), in reference to the French spelling of the word line, or ligne.

The curve \(G\) will refer to the straight line segment of length 1. Equivalently, \(G\) is congruent to \(\left\langle\left(0,0\right),\left(1,0\right)\right\rangle\)

But what curves are the least thin? There’s a few ways to approach this. The simpler

The maximum radius of a circle accomodated by a single cure’s convex hull is \(\frac{1}{2+\pi}\).
proof:
Consider the circle \(c\) of radius \(r\) defined with \((y-r)^2+x^2=(r)^2\). Suppose \(m\) is a curve with endpoints \(a\) and \(b\) that accomodates \(c\) with minimal length. In order to minimize the length of \(m\), \(\langle a,b \rangle\) must be tangent to \(c\). Without loss of generality, let \(a\) and \(b\) lie on the \(x\) axis. The locations of \(a\) and \(b\) that minimize the length of \(m\) are directly beneath the ends of \(c\), at \((\pm r,0)\)

To minimize the size of \(m\), assume that \(m\) is “taut” against the circle \(c\), as shown in the drawing below:

The total length of \(m\) is therefore \(2r+\pi r\). If \(m\) is a unit length curve, then $$\begin{align*} 1 =2r+\pi r &=r(2+\pi)\\ \frac{1}{2+\pi} &= r \end{align*}$$

\(\square\)

Now that that’s done, we can use this as the basis of the definition of our second useful curve!

The curve \(M\) will refer to the curve that accomodates the circle with radius \(\frac{1}{2+\pi}\). This can be defined as the curve congruent to the set:

$$\begin{rcases} x=\pm\frac{1}{2+\pi} &\text{for } y\in\lbrace -\infty,0 \rbrace \\ x^2+y^2=\frac{1}{(2+\pi)^2} &\text{for } y\in\lbrace 0,r\rbrace \end{rcases}$$

The Broadworm
#

That’s the largest possible circle that can be accomodated with a unit length curve, but is there another way to thinking about how “wide” a curve is? The standard definition of a shapes breadth goes like this:

The width of a shape at an angle \(\theta\) is the minimal distance between two lines at angle \(\theta\) that contain the entirety of that shape.

The breadth of a shape is its minimal width.

This shape, appropriately, is called the broadworm.

The curve \(B\) will refer to the broadworm.

As mentioned, I am not an expert in computational geometry, so my knowledge on the field is limited. However, I do know that much of the work done so far considering Moser’s worm has to do with the so called “broadworm”, or \(B\). I’ve joked with a friend about building the broadworm as a rite-of-passage for anyone interested in Mosers worm. With that in mind, I’m going to start constructing the broadworm with two assumptions. I think they’re both fair to assume as a starting point, although a more propper proof would demonstrate these both from first principles.

  1. \(B\) is a type 1 curve.
  2. \(B\) is symmetric.

For ease, I’ll let the endpoints of \(B\) be labeled \(P_1\) and \(P_3\), and I’ll let the point \(P_2\) be the point furthest from \(\left\langle P_1,P_3\right\rangle\). I’ll let \(d=\ell\left\langle P_1,P_3\right\rangle\) and let \(h\) be the distance between \(P_2\) and \(\left\langle P_1,P_3\right\rangle\). Immediately, we have that the width of \(B\) is at least \(\min(h,d)\).

Let’s use \(w\) to be the width of \(B\). If one considers two parallel lines with \(B\) between them, with one passing through both \(P_1\) and \(P_3\), then the other parallel line would have to pass through \(P_2\). In this case, the width between these two lines is \(h\), giving no new information.

So suppose instead that neither of the two parallel lines pass through both \(P_1\) and \(P_3\). In order to move forward smoothly, I’ll build the following lemma:

Lemma 7

Let \(B\) be a Type 1 curve with endpoints \(P_1=(0,0)\) and \(P_3=(d,0)\). Let \(y_1(x)\) and \(y_2(x)\) be parallel lines with slope \(m\neq 0\) that are of minimum distance apart with \(B\) between them. At least one of \(y_1(x)\) and \(y_2(x)\) pass through a single endpoint of \(B\).

proof:

Because neither \(y_1(x)\) nor \(y_2(x)\) are horizontal, they must both have an \(x\)-intercept. Let’s define \(y_1(x)\) to be the line with a smaller \(x\)-intercept. Suppose first that \(m<0\), so that the entirety of \(B\) lies on or above \(y_1(x)\). Note that the entirety of \(B\) aside from the endpoints lie above and to the right of \(P_1\), meaning that \(y_1(x)\) must pass through \(P_1\).

By symmetry, \(y_2(x)\) must pass through \(P_3\) if \(m>0\).

\(\square\)

This can be taken a step further:

Lemma 8

Let \(B\) be a Type 1 curve with endpoints \(P_1=(0,0)\) and \(P_3=(d,0)\) that has the maximum width possible for curves with length 1. Assume also that \(B\) is symmetric around the line \(x=\frac{d}{2}\). If \(P_2\) is the point of \(B\) with the largest \(y\)-value, then there are two points, called \(E_1\) and \(E_2\), such that

$$B_{P_2}^{P_3}=\left\langle P_2,E_1\right\rangle\Cup B_{E_1}^{E_2}\Cup\left\langle P_2,E_1\right\rangle$$

Where \(B_{E_1}^{E_2}\) is an arc of a circle centered at \(P_1\), and \(\left\langle P_2,E_1\right\rangle\) and \(\left\langle P_2,E_1\right\rangle\) are tangent to that circle. The \(y\)-value of \(P_2\) is \(w\)

In other words, \(B\) consists of two symmetric halves, each of which is a pair of straight lines connected by a circular arc.

proof:

Let \(y_1(x)\) and \(y_2(x)\) be two lines with negative slope such that \(B\) lies entirely on or between \(y_1(x)\) and \(y_2(x)\) with minimal distance between them. Because neither \(y_1(x)\) nor \(y_2(x)\) are horizontal, they must both have an \(x\)-intercept. Let’s define \(y_1(x)\) to be the line with a smaller \(x\)-intercept. We immediately have that the entirety of \(B\) lies on or above \(y_1(x)\). We have defined \(P_2\) to be the highest point of \(B\), so the entirely of \(B_{P_1}^{P_2}\) has positive slope. Therefore, \(y_2(x)\) passes through \(B_{P_2}^{P_3}\).

Let \(w\) be the width of \(B\). Because \(y_1(x)\) passes through \(P_1\), the entirety of \(B_{P_2}^{P_3}\) must lie at a distance of at least \(w\) from \(P_1\). Because \(B\) is a type 1 curve, \(B_{P_2}^{P_3}\) is a segment of the convex hull of \(P_2\), \(P_3\) and the circle centered at \(P_1\) with radius \(w\).

By symmetry, the same is also true for circumstances in which the slope of \(y_1(x)\) and \(y_2(x)\) is positive. We now have that \(B\) consists of:

  • \(S_1\): An arc of the radius \(w\) circle centered at \(P_3\)
  • \(S_2\): A line segment tangent to \(S_1\) passing through \(P_1\)
  • \(S_3\): A line segment tangent to \(S_1\) passing through \(P_2\)
  • \(S_4\): An arc of the radius \(w\) circle centered at \(P_1\)
  • \(S_5\): A line segment tangent to \(S_4\) passing through \(P_3\)
  • \(S_6\): A line segment tangent to \(S_4\) passing through \(P_2\)

Keeping \(d\) and \(w\) constant, it must not be possible to move \(P_2=\left(\frac{d}{2},h\right)\) to a location that decreases the length of \(B\). We already know that \(P_{2y}\geq w\), because it is possible to fit \(B\) inside the parallel lines \(y=0\) and \(y=h\), which have a distance between them of \(h\). It’s nice to note that the distance between \(P_2\) and \(P_1\) is necesarily greater than \(w\), and the same is true for \(P_2\) and \(P_3\).

Lowering \(P_2\) brings it closer to both \(P_1\) and \(P_3\). Therefore, lowering the height of \(P_2\) decreases the length of both \(S_1\cup S_3\) and \(S_4\cup S_6\). Therefore, \(P_{2y}\) is as low as possible, and so is equal to \(h\)

\(\square\)

What is the total length of all these curves? Let’s focus on specifically the right hand side of \(B\). Define \(E_1\) and \(E_2\) to be the endpoints of a circle with radius \(w\) and center \(P_1\), and let them lie at angles of \(\theta_1\) and \(\theta_2\), such that

$$E_1=\left(w\cos(\theta_1),w\sin(\theta_1)\right)$$ $$E_2=\left(w\cos(\theta_2),w\sin(\theta_2)\right)$$

The length of this circular arc is simply \(w(\theta_2-\theta_1)\). Therefore, in order to give \(B\) a length of 1, we need that:

$$\frac{1}{2}=\ell\left\langle P_2,E_2\right\rangle + w(\theta_2-\theta_1) + \ell\left\langle E_1,P_3\right\rangle$$

As discussed in Lemma 6, the length of a line segment tangent to a circle with radius \(r\) to a point with distance \(p\) to the circle’s center is \(\sqrt{p^2-r^2}\). In this case, the radius of the circle is \(w\), and the distance from the origin to \(P_2\) is \(\sqrt{\frac{d^2}{4}+w^2}\). Therefore, the length of \(\left\langle P_2,E_2\right\rangle\) is:

$$\ell\left\langle P_2,E_2\right\rangle=\sqrt{\frac{d^2}{4}+w^2-w^2}=\frac{d}{2}$$

By the same method, we can easily find the length of \(\left\langle E_1,P_3\right\rangle\):

$$\ell\left\langle E_1,P_3\right\rangle=\sqrt{d^2-w^2}$$

The same result also gives the angle between \(P_2\) and \(E_2\) as \(\arccos\left(\frac{w}{\sqrt{\frac{d^2}{4}+w^2}}\right)\). The angle to \(P_2\) is \(\arctan\left(\frac{2w}{d}\right)\). Therefore, we deduce that:

$$\theta_2 =\arctan\left(\frac{2w}{d}\right)-\arccos\left(\frac{w}{\sqrt{\frac{d^2}{4}+w^2}}\right)$$

This can be simplified, as shown[1]In the interest of transparency, this identity was pointed out to me by Emma Joe Anderson. It greatly simplifies calculations further on.:

$$\begin{align*} \theta_2 &=\arctan\left(\frac{2w}{d}\right)-\arccos\left(\frac{w}{\sqrt{\frac{d^2}{4}+w^2}}\right)\\ &=\arctan\left(\frac{2w}{d}\right)-\arctan\left(\frac{d}{2w}\right)\\ &=2\arctan\left(\frac{2w}{d}\right)-\frac{\pi}{2}\\ \end{align*}$$

And the same method gives \(\theta_1\) as well:

$$\theta_1=\arccos\left(\frac{w}{d}\right)$$

Finally, to give \(B\) a length of 1, we can get by substitution:

$$\frac{1}{2}=\frac{d}{2} + w\left(-\frac{\pi}{2}+2\arctan\left(\frac{2w}{d}\right)-\arccos\left(\frac{w}{d}\right)\right) + \sqrt{d^2-w^2}$$

Therefore, the value of \(w\) is simply a function of \(d\), and our goal is to find the value of \(d\) that maximizes it. To do this, we’ll take the derivative with respect to \(d\):

$$0 =\frac{1}{2}+ w’\left(-\frac{\pi}{2}+2\arctan\left(\frac{2w}{d}\right)-\arccos\left(\frac{w}{d}\right)\right)+w\left(\frac{4dw’-4w}{d^2+4w^2}+\frac{dw’-w}{d^2\sqrt{1-\frac{w^2}{d^2}}}\right)+\frac{d-ww’}{\sqrt{d^2-w^2}}$$

Because we want to maximize \(w\), we’ll set \(w’\) equal to 0:

$$\begin{align*} 0 &=\frac{1}{2}+w\left(\frac{-4w}{d^2\left(1+\frac{4w^2}{d^2}\right)}+\frac{-w}{d^2\sqrt{1-\frac{w^2}{d^2}}}\right)+\frac{d}{\sqrt{d^2-w^2}}\\ 0 &=\frac{1}{2}-\frac{4w^2}{d^2+4w^2}-\frac{w^2}{d\sqrt{d^2-w^2}}+\frac{d}{\sqrt{d^2-w^2}}\\ 0 &=\frac{d^2+4w^2}{2d^2+8w^2}-\frac{8w^2}{2d^2+8w^2}+\frac{d^2-w^2}{d\sqrt{d^2-w^2}}\\ 0 &=\frac{d^2-4w^2}{2d^2+8w^2}+\frac{d^2-w^2}{d\sqrt{d^2-w^2}}\\ 0 &=\frac{d^2-4w^2}{2d^2+8w^2}+\frac{\sqrt{d^2-w^2}}{d}\\ 0 &=\frac{\left(d^2-4w^2\right)^2}{\left(2d^2+8w^2\right)^2}-\frac{d^2-w^2}{d^2}\\ 0 &=\frac{\left(d^4-8d^2w^2+16w^4\right)}{4\left(d^4+8d^2w^2+16w^2\right)}+\frac{w^2-d^2}{d^2}\\ 0 &=d^2\left(d^4-8d^2w^2+16w^4\right)+4\left(d^4+8d^2w^2+16w^4\right)\left(w^2-d^2\right)\\ 0 &=d^6-8d^4w^2+16d^2w^4+4d^4w^2+32d^2w^4+64w^6-4d^6-32d^4w^2-64d^2w^4\\ 0 &=64w^6-16d^2w^4-36d^4w^2-3d^6\\ \end{align*}$$

This is a cubic equation! Giving its precice solution would be tedious[2]Especially for work that’s already been done by other mathematicians, so I’ll simply note that Wolfram Alpha gives its solutions as

$$w^2\approx -0.58022d^2$$ $$w^2\approx -0.08798d^2$$ $$w^2\approx 0.91821d^2$$

Both \(w\) and \(d\) are real-valued, so we’ll use the last of these. Taking the square root gives the ratio betwen \(w\) and \(d\).

$$w=0.95823d$$

Plugging this into the previous equation from the length of \(d\) gives:

$$\begin{align*} \frac{1}{2} &\approx\frac{d}{2} + 0.95823d\left(-\frac{\pi}{2}+2\arctan\left(1.91646\right)-\arccos\left(0.95823\right)\right) + \sqrt{d^2-0.91821d^2}\\ \frac{1}{2} &\approx 0.5d + 0.95823d\left(-1.5709.+2.17973-0.29005\right)+0.28600d\\ \frac{1}{2} &\approx 4.0445d\\ d &\approx 0.45806 \end{align*}$$

Based on this, we arrive at:

$$w=0.43852$$

Additional curves
#

How else could we think of the curve most unlike \(G\)? The third way we might consider is to search for the curve with a point furthest from the line connecting its endpoints. This is the equilateral triangle with sides of length \(\frac{1}{2}\).

For any angle \(\theta\), Let \(V_\theta\) be the union of two line segments of length \(\frac{1}{2}\) meeting at an angle of \(\theta\). Equivalently, \(V_\theta\) is congruent to

$$\left\langle \left(-\frac{1}{2}\cos{\frac{\theta}{2}},0\right), \left(0,\frac{1}{2}\sin{\frac{\theta}{2}}\right), \left(\frac{1}{2}\cos{\frac{\theta}{2}},0\right) \right\rangle $$

For convenience, define \(E=V_{\frac{\pi}{6}}\).

As you might have guessed, the name \(V\) was chosen due to its resemblance to an angle. The name \(E\) was chosen because of the similarity to an equilateral triangle. As before, it’s worthwhile to show that the length of this curve is 1. The distance between \(\left(0,\frac{1}{2}\sin{\frac{\theta}{2}}\right)\) and \(\left(\frac{1}{2}\cos{\frac{\theta}{2}},0\right)\) is given by:

$$ \sqrt{\left(\frac{1}{2}\sin{\frac{\theta}{2}}\right)^2 + \left(\frac{1}{2}\cos{\frac{\theta}{2}}\right)^2} = \sqrt{\frac{1}{4}\sin^2{\frac{\theta}{2}} + \frac{1}{4}\cos^2{\frac{\theta}{2}}} = \frac{1}{2} $$

What other simple curves are important enough to be given brief names? There’s a fair argument to be made that the simplest curves would be those of constant curvature, also known as arcs.

For any radius \(r\), Let \(A_r\) be the unit length arc of a circle with radius \(r\). Equivalently, \(A_r\) is congruent to

$$y=\sqrt{r^{2}-x^{2}}-r\sin\left(\frac{\pi r-1}{2r}\right)$$

for \(y>0\)

It’s worth showing that the shape given in this definition does in fact have a length of 1. The two places where \(y=0\) are given with:

$$\begin{align*} 0 &= \sqrt{r^{2}-x^{2}}-r\sin\left(\frac{\pi r-1}{2r}\right)\\ r\sin\left(\frac{\pi r-1}{2r}\right) &= \sqrt{r^{2}-x^{2}}\\ r^2\sin^2\left(\frac{\pi r-1}{2r}\right) &= r^2-x^2\\ r^2 - r^2\sin^2\left(\frac{\pi r-1}{2r}\right) &= x^2\\ r^2\left(1-\sin^2\left(\frac{\pi r-1}{2r}\right)\right) &= x^2\\ r^2\cos^2\left(\frac{\pi r-1}{2r}\right) &= x^2\\ \pm r\cos\left(\frac{\pi r-1}{2r}\right) &= x \end{align*}$$

Note also that \(y’=\frac{x}{\sqrt{r^{2}-x^{2}}}\). Using the standard formula for length of a curve gives:

$$\begin{align*} \int_{-r\cos\left(\frac{\pi r-1}{2r}\right)}^{r\cos\left(\frac{\pi r-1}{2r}\right)}\sqrt{1+y’^2}dx &= \int_{-r\cos\left(\frac{\pi r-1}{2r}\right)}^{r\cos\left(\frac{\pi r-1}{2r}\right)}\sqrt{1+\frac{x^2}{r^2-x^2}}dx\\ &= \int_{-r\cos\left(\frac{\pi r-1}{2r}\right)}^{r\cos\left(\frac{\pi r-1}{2r}\right)}\sqrt{\frac{r^2}{r^2-x^2}}dx\\ &= \int_{-r\cos\left(\frac{\pi r-1}{2r}\right)}^{r\cos\left(\frac{\pi r-1}{2r}\right)}\frac{1}{\sqrt{1-\frac{x^2}{r^2}}}dx\\ &= \left.-r\arccos\left(\frac{x}{r}\right)\right|_{-r\cos\left(\frac{\pi r-1}{2r}\right)}^{r\cos\left(\frac{\pi r-1}{2r}\right)}\\ &= -r\arccos\left(\frac{r\cos\left(\frac{\pi r-1}{2r}\right)}{r}\right) + r\arccos\left(\frac{-r\cos\left(\frac{\pi r-1}{2r}\right)}{r}\right)\\ &= -r\arccos\left(\cos\left(\frac{\pi r-1}{2r}\right)\right) + r\arccos\left(-\cos\left(\frac{\pi r-1}{2r}\right)\right)\\ &= -r\left(\frac{\pi r-1}{2r}\right) + r\left(-\arccos\left(\cos\left(\frac{\pi r-1}{2r}\right)\right)+\pi\right)\\ &= -r\left(\frac{\pi r-1}{2r}\right) + r\left(-\left(\frac{\pi r-1}{2r}\right)+\pi\right)\\ &= -\frac{\pi r-1}{2} -\frac{\pi r-1}{2}+\pi r\\ &= -\pi r+1 +\pi r = 1 \end{align*}$$

But notice now that all of these have a roughly similar shape! They all have their endpoints at the diameter length apart, and bunch up in the center. In other words, \(M\), \(E\), and \(B\) are all shaped a little bit like circular arcs. To get away from this, let’s define the Type 1 rectangles:

The curve \(R_s\) is the set of curves congruent to:

$$\left\langle (0,0), \left(0,s\right), \left(1-2s,2\right), (1-2s,2)\right\rangle$$

Or the rectangle with a length of \(1-2s\) and a width of \(s\).

Type 2 curves
#

Finally, let’s look into Type 2 curves. A polygonal chain is an ordered set of points along with the line segments connecting adjacent points. For our purposes here, we’ll consider only the 3-sided polygonal chains with parallel first and final sides. We’ll do this with three degrees of freedom, as defined here:

Given the following four points:

$$\begin{align*} Z_1 &= \left(0,0\right)\\ Z_2 &= \left(\frac{1-\sqrt{\alpha^2+\gamma^2}}{2}+\beta,0\right)\\ Z_3 &= \left(\frac{1-\sqrt{\alpha^2+\gamma^2}}{2}+\beta+\gamma,\alpha\right)\\ Z_4 &= \left(1-\sqrt{\alpha^2+\gamma^2}+\gamma,\alpha\right)\\ \end{align*}$$

Let (Z(\alpha,\beta,\gamma)\) be the class of curves congruent to \(\left\langle Z_1, Z_2, Z_3, Z_4\right\rangle\)

I’ll show that the defined shapes are indeed valid Type 2 curves before explaining the motivation for this definition. The lines \(\left\langle Z_1, Z_2\right\rangle\) and \(\left\langle Z_3, Z_4\right\rangle\) are parallel, as their definitions here give them each a slope of 0. Additionally, the total length of this curve is given by:

$$\begin{align*} \ell\left\langle Z_1, Z_2, Z_3, Z_4\right\rangle &= \ell\left\langle Z_1, Z_2\right\rangle + \ell\left\langle Z_2, Z_3\right\rangle + \ell\left\langle Z_3, Z_4\right\rangle\\ &= \frac{1-\sqrt{\alpha^2+\gamma^2}}{2}+\beta + \ell\left\langle Z_2, Z_3\right\rangle + 1-\sqrt{\alpha^2+\gamma^2}+\gamma-\left(\frac{1-\sqrt{\alpha^2+\gamma^2}}{2}+\beta+\gamma\right)\\ &= \ell\left\langle Z_2, Z_3\right\rangle + 1-\sqrt{\alpha^2+\gamma^2}\\ &= \sqrt{\left(\frac{1-\sqrt{\alpha^2+\gamma^2}}{2}+\beta+\gamma-\left(\frac{1-\sqrt{\alpha^2+\gamma^2}}{2}+\beta\right)\right)^2+\alpha^2} + 1-\sqrt{\alpha^2+\gamma^2}\\ &= \sqrt{\gamma^2+\alpha^2} + 1-\sqrt{\alpha^2+\gamma^2} = 1\\ \end{align*}$$

These three variables have the following motivations:

  • \(\alpha\) is the height of \(Z_3\) and \(Z_4\) above \(Z_1\) and \(Z_2\).
  • \(\beta\) is the extra length removed from \(\left\langle Z_3, Z_4\right\rangle\) and added to \(\left\langle Z_1, Z_2\right\rangle\).
  • \(\gamma\) is the distance by which \(Z_3\) is horizontally offset from \(Z_2\).

For convenience, we can define a “basic” version of this curve

$$Z_h=Z\left(h,0,0\right)$$

This is shown here:

A hopeful reader might wonder if this curve might in fact be accomodated by a type 1 curve. \(Z_h\) certainly doesn’t look like it could be accomodated by a type 1 curve. First, I’ll note the exact positions of points in \(Z_h\):

$$\begin{align*} Z_1 &= \left(0,0\right)\\ Z_2 &= \left(\frac{1-h}{2},0\right)\\ Z_3 &= \left(\frac{1-h}{2},h\right)\\ Z_4 &= \left(1-h,h\right)\\ \end{align*}$$

Let’s first note that an optimal type 1 curve accomodating \(Z_h\) would contain the lines connecting three vertices of \(Z_h\). Without loss of generality, let’s let those points be \(Z_1\), \(Z_3\), and \(Z_4\). The distance between those points is given by:

$$\begin{align*} \ell\left\langle Z_1, Z_3, Z_4\right\rangle &= \ell\left\langle Z_1, Z_3\right\rangle + \ell\left\langle Z_3, Z_4\right\rangle\\ &= \sqrt{\left(\frac{1-h}{2}\right)^2+h^2} + \frac{1-h}{2}\\ &= \sqrt{\frac{1-2h+5h^2}{4}} + \frac{1-h}{2}\\ &= \frac{\sqrt{1-2h+5h^2}+1-h}{2}\\ \end{align*}$$

If a type 1 curve \(X\) accomodates \(Z_h\), then it contains \(\left\langle Z_1, Z_3, Z_4\right\rangle\) as well as an additional stretch that allows \(Z_2\) to be included. \(Z_2\) is closer to \(Z_1\) than it is to \(Z_4\), so the best case scenario is that \(X\) includes a point on the line passing through \(Z_2\) and \(Z_4\). This line is given by:

$$y=\frac{2h}{1-h}\left(x-\frac{1-h}{2}\right)=\frac{2h}{1-h}x-h$$

The distance from the origin to this line is given by:

$$\frac{h\left(\frac{1-h}{2}\right)}{\sqrt{h^2+\frac{(1-h)^2}{4}}}=\frac{\left(\frac{h-h^2}{2}\right)}{\sqrt{\frac{1-2h+5h^2}{4}}}=\frac{\left(h-h^2\right)}{\sqrt{1-2h+5h^2}}$$

Unfortunately, this means that the full length of \(X\) is given by:

$$\begin{align*} \ell X &= \frac{\left(h-h^2\right)}{\sqrt{1-2h+5h^2}}+\frac{\sqrt{1-2h+5h^2}+1-h}{2}\\ &= \frac{\left(2h-2h^2\right)}{2\sqrt{1-2h+5h^2}}+\frac{\left(\sqrt{1-2h+5h^2}+1-h\right)\sqrt{1-2h+5h^2}}{2\sqrt{1-2h+5h^2}}\\ &= \frac{1+3h^2+\left(1-h\right)\sqrt{1-2h+5h^2}}{2\sqrt{1-2h+5h^2}}\\ \end{align*}$$

This function is above 1 for all values of \(h\) between 0 and 1. Therefore, \(X\) has a length above 1, and so \(Z_h\) is a real type 2 curve.

Convex Hulls of Multiple Curves
#

My plan is to spend this section establishing some useful lemmas about the convex hulls of multiple curves, as a bit of set dressing for the sections to come. The nature of this problem requires that we seek to minimize the size of a convex hull, so I’ll give a definition relating to this:

Given a set of curves \(\lbrace A,B,\dots\rbrace =\Omega\), let \(\lbrace a,b,\dots\rbrace=\omega\) be a set of containing exactly one instance of each curve in \(\omega\). define \([\omega]\) to be the single convex hull of the union of \(\omega\). Define \([\Omega]\) to be the set of all possible instances of \(\omega\).

Let \([\omega]\) be a convex hull in \([\Omega]\). We say a\([\omega]\) is minimal if it has the smallest possible area of all elements of \([\Omega]\). The set of minimal convex hulls in \([\Omega]\) will be denoted with \([\Omega]^-\).

For convenience, we order \([\Omega]\) by area, so that it is a weakly ordered set. We use \(\omega_1\lesssim \omega_2\) to indicate that \(\omega_1\) is smaller than or equal in size to \(\omega_2\). By this ordering, \([\Omega]^-\) is the set of minimal elements of \([C]\).

Here’s the first lemma for this section:

Lemma 9 Let \(\Omega=\lbrace A, B, C, \dots\rbrace\) be a set of curves, and let \(\omega=\lbrace a, b, c, \dots\rbrace\) be a set of their instances, such that \([\omega] \in [\Omega]^-\). If there is an instance of \(A\) inside \([\omega]\) that does not touch its boundary, then \([\omega]\in[\Omega/A]^-\).

proof:
Let \(a’\) be the instance of \(A\) that is internal to \(X\) but does not touch its boundary, let \(\omega’=\lbrace a’, b, c, \dots\rbrace\), and let \(X’\) be the convex hull of the union of \(\omega’\).

It can be immediately seen by definition that \(\omega’\in\Omega\), and therefore that \(X’\in [\Omega]\). Because \(X\) is a minimal element of \([\Omega]\), \(X\lesssim X’\). \(X’\) contains no points outside the boundary of \(X\), and so \(X\nless X’\).

Therefore, \(X’\) has the same area as \(X\). This implies that \(X=X’\).

\(\square\)

In order to produce the next lemma. we give a special definition:

Let \(A\) and \(B\) be two shapes, and let \(a\) and \(b\) be instances of those shapes. We say that \(a\) and \(b\) are fully intersecting if there exists no linear translation of \(a\), called \(a’\), such that \([b\cup a’]\) is a strict subset of \([b\cup a]\)
It should be noted that “fully intersecting” and minimal are not the same thing. Given two shapes \(A\) and \(B\), and a pair of instances \(a\) and \(b\), it is possible that \(a\) and \(b\) may be fully intersecting, even if \([a,b]\not\in[A,B]^-\). On the other hand, it is true that if \(a\) and \(b\) are not fully intersecting, then \([a,b]\not\in[A,B]^-\).

This definition comes with an almost immediate lemma:

Lemma 7
Let \(\Omega\) be a set of curves, and let \(\omega\) be a set of their instances. If \(A\) and \(B\) are elements of \(\Omega\), then \([\omega]\) accomodates some pair of instances \(a\) and \(b\), where \(a\) and \(b\) are fully intersecting.

proof:

Let \(a,b\in\omega\). First, we know that \([\omega]\sqsupseteq[a,b]\). We also know that if \(a\) and \(b\) are not fully intersecting, they at least have a fully intersecting pair of linear transformations, \(a’\) and \(b’\), such that \([a,b]\sqsupseteq[a’,b’]\). By the transitive property, \([\omega]\sqsupseteq[a’,b’]\)

\(\square\)

We can extend this lemma just a little bit further, finally reaching a theorem about the nature of minimal convex hulls.

Theorem 4
Let \(\Omega\) be a set of curves. If \(\omega\) is a set of instances, such that \([\omega]\in[\Omega]^-\), then every pair of elements in \(\omega\) are fully intersecting.

proof:

Let \(\omega\) be such that \([\omega]\in[\Omega]^-\). In order to force a contradiction, assume that there exists \(a,b\in\omega\) such that \(a\) and \(b\) are not fully intersecting. Because of Lemma 9, we may assume that both \(a\) and \(b\) touch the boundary of \([\omega]\). In fact, they both extend the boundary of \([\omega]\), meaning that removing them would result in a smaller overall convex hull.

Let’s assume (WLOG) that \(a’\) is a linear transformation of \(a\), such that \(a’\sqsubset[a,b]\). Let \(\omega’\) be formed by replacing \(a\) in \(\omega\) with \(a’\). Note that, by the definition of fully intersecting, \([a,b]\sqsupseteq[a’b]\), and therefore, \([\omega]\sqsupseteq[\omega’]\). This means that if there is any point on the boundary of \([\omega]\) that is not present in \([\omega’]\), then \([\omega]\sqsupset[\omega’]\). But this cannot be the case if \([\omega]\in[\Omega]^-\), so it must be that \([\omega]=[\omega’]\).

We now know that there are some points present in \([a,b]\) that are absent in \([a’,b]\), but that none of these points are on the boundary of \([\omega]\). As mentioned, \(a\) lies on the boundary of \([\omega]\). If \(a’\) is also on the boundary of \([\omega]\), that means that \(a\) must not have actually expanded the boundary of \([\omega]\). This also violates the spirit of Lemma 9, giving a contradiction.

\(\square\)

For this lemma to be maximally useful, it’s a good idea to characterize better what makes two curves fully intersecting.

Theorem 5
Let \(c\) and \(d\) be closed convex shapes, neither of which is a subset of the other, and neither of which has an area of 0. Suppose also that the boundaries of \(c\) and \(d\) intersect at fewer than three points.

If \([c,d]/\left(c\cup d\right)\) is discontinuous, \(c\) and \(d\) are not fully intersecting.

proof:

To use more down-to-earth terminology, if \([c,d]/\left(c\cup d\right)\) is discontinuous, this means that \([c,d]\) adds two non-adjacent regions to \(\left(c\cup d\right)\). An example of this is shown below. The dark grey regions are the interiors of \(c\) and \(d\), while the light grey regions are the areas added by \([c,d]\)

Let \(\left\langle c_1,d_1\right\rangle\) and \(\left\langle c_2,d_2\right\rangle\) be the two lines on the border of \([c,d]\) that are not part of \(c\) or \(d\), where \(c_1\) and \(c_2\) lie on \(c\), and \(d_1\) and \(d_2\) lie on \(d\). Assume, without loss of generality, that \(\ell\left\langle c_1,c_2\right\rangle \leq \ell\left\langle d_1,d_2\right\rangle\), so that the two lines are either parallel, or slanting to be closer near \(c\). This is shown below:

For what comes next, consider rotating \([c,d]\) such that the slope of \(\left\langle c_1,d_1\right\rangle\) is non-negative, and the slope of \(\left\langle c_2,d_2\right\rangle\) is non-positive, as in the previous diagram. For every point \(p\) in \(c\), there is a positive distance between \(p\) and the point on the border of \(d\) and \([c,d]\) that shares \(p\)’s \(y\)-coordinate. If there wasn’t, then \(p\) would lie on the the border of \(d\), violating one of our assumptions. Let \(x\) be the minimal possible such distance, and let \(c’\) be the result of shifting \(c\) a distance of \(x\) towards \(d\).

We have now demonstrated that \([c’,d]\subset[c,d]\), and by definition, \(c\) and \(d\) are not fully intersecting.

\(\square\)

Corollary 5.1

Let \(g\) be an instance of \(G\), and let \(c\) be a closed convex shape that does not accommodate \(G\). If either endpoint of \(g\) is internal to \(c\), then \(g\) and \(c\) are not fully intersecting.

proof:

Let \(g_1\) be the endpoint of \(g\) that is internal to \(c\), meaning that it is in \(c\), but not on its boundary. Let \(g_2\) be the endpoint of \(g\) that is external to \(c\). There exists a closed ball with positive radius \(r\) around \(g_1\) entirely in \(c\). Let \(g_2’\) be the point on \(g\) at distance \(r\) from \(g_2\), and let \(g_1’\) be the point 1 unit from \(g_2’\) such that \(\left\langle g_1’,g_2’\right\rangle\) passes through \(g_1\). furthermore, let \(g’=\left\langle g_1’,g_2’\right\rangle\). We may observe that \(g_1’\) an \(g_2’\) are both in \([c,g]\), but \(g_2\) is not in \([c,g’]\). By definition, \([c,g’]\subset[c,g]\), and so \(g\) and \(c\) are not fully intersecting.

\(\square\)

For the final result of this this section, it’ll be helpful to note one final definition. To be clear, this isn’t a definition of my own, but one that is commonly used.

Given a shape \(A\), the maximal distance between any two points in \(A\) is called \(A\)’s diameter.

In mathematical equations, the diameter of \(A\) will be notated with \(\text{diam}(A)\)

The theorem we’re attempting to develop with this is:

Theorem 6

For any convex shapes \(C\) and \(D\), the diameter of any element of \([C,D]^-\) is \(\text{max}\left(\text{diam}\left(C\right), \text{diam}\left(D\right)\right)\).

Showing this is quite hard! It seems true, but that’s hardly proof. I’m spinning off this discussion into its own mini-article, so it doesn’t become a massive part of this one.

Moserville
·558 words·3 mins

Examples of Minimal Covers on Finite Sets
#

In a previous section, I limited the space of curves worth considering by defining taughtness. I showed that some curves are accomodated by other curves, and therefore not worth considering. In the next section, I’ll try to limit the space of curves even further by finding some curves that are accomodated by sets of other curves.

A curve \(X\) is proxy accommodated by a set of curves \(\Xi\) if \(X\) is accommodated by every shape in \([\Xi]^-\).

My initial goal was to hopefully find a finite set of taut curves that proxy accommodate all other taut curves, and then to find the smallest set that contains all curves in that finite set. This goal seems reasonable at first, as it is much easier to find \([\Xi]^-\) when \(\Xi\) is a finite set of curves as opposed to an infinite one. There are some severe problems with this as an objective, but there’s still much to be learned by exploring it!

In this section, to illustrate that it is in fact possible to find the minimal size of covering shapes for certain finite numbers of curves, we’ll find a few examples. These next two theorems are 5% insight and 95% bookkeeping, so feel free to skip their proofs if you’re in a hurry.

Theorem 7
All elements of \([S,G]^-\) are congruent to a quadrilateral defined with:

$$\left\langle \left(\frac{1}{3},0\right), \left(a,\frac{1}{3}\right), \left(0,\frac{1}{3}\right), \left(b,0\right), \left(\frac{1}{3},0\right)\right\rangle$$

Where \(b<0\), \(a>\frac{1}{3}\), and \(a-b = \frac{2\sqrt{2}}{3}\). Their area is approximately \(0.213\).

proof:
First, assume without loss of generality that the instance of \(S\) is located on the points \((0,0)\), \(\left(\frac{1}{3},0\right)\), \(\left(\frac{1}{3},\frac{1}{3}\right)\), and \(\left(0,\frac{1}{3}\right)\), as in the definition’s example. Call this instance \(s\). Let the intance of \(G\) have endpoints \(A\) and \(B\), so that \(\ell\langle A,B\rangle = 1\). Define this instance of the line so that \(\langle A,B\rangle = g\). Note that the line between \(A\) and \(B\) must pass through the square in order for these shapes to be fully intersecting.

Of course, \([s,g]\) is the convex hull of these points. A potential instance of \([s,g]\) is shown in the illustration below:

\([s,g]\) can be deconstructed into the central square and multiple triangles:

Let \(A=(a_x,a_y)\), and let \(B=(b_x,b_y)\). Additionally, let \(f(x)\) be the distance of \(x\) to the interval \(\left[0,\frac{1}{3}\right]\), so that

$$f(x) = \begin{cases} -x &\text{if } x\leq 0 \\ 0 &\text{if } x\in \left(0,\frac{1}{3}\right)\\ x-\frac{1}{3} &\text{if } x \geq \frac{1}{3} \end{cases}$$

To give an example of how this could be useful, consider the triangle formed by \(a_x\). If \(A\) is the lower point on the figure above, this would be the green triangle. Using the standard formula for area of a triangle, we get \(\frac{1}{2}\cdot\frac{1}{3}\cdot f(a_x)\)

Then, the full area of \([s,g]\) would be given by:

$$\frac{1}{9}+\sum_{i} \frac{f(i)}{6}$$

Where \(i\) is an element of \(\left\lbrace a_x,a_y,b_x,b_y\right\rbrace \). The presence of \(\frac{1}{9}\) is due to the area of \(s\). There is also the limitation from the length of \(g\) which implies that:

$$\begin{align*} 1 &= \ell \langle A,B\rangle\\ 1 &= \sqrt{\left(a_x-b_x\right)^2+\left(a_y-b_y\right)^2}\\ 1 &= \left(a_x-b_x\right)^2+\left(a_y-b_y\right)^2 \end{align*}$$

For the time being, let’s hold \(A\) constant, and consider the optimal placement of \(B\), assuming without loss of generality that \(b_x \geq 0\) and that \(b_y\geq \frac{1}{3}\). Solving the previous equation for \(b_y\) gives

$$\begin{align*} 1 &= \left(a_y-b_y\right)^2+\left(a_x-b_x\right)^2\\ 1- \left(a_y-b_y\right)^2 &= \left(a_x-b_x\right)^2\\ \sqrt{1- \left(a_y-b_y\right)^2} &= a_x-b_x\\ a_x - \sqrt{1- \left(a_y-b_y\right)^2} &= b_x \end{align*}$$

Therefore, the maximum value of \(b_x\) is:

$$b_x \leq a_x - \sqrt{1- \left(a_y-\frac{1}{3}\right)^2}$$

By symmetry, we also know that \(b_y= a_y - \sqrt{1- \left(a_x-b_x\right)^2}\). Therefore, the total area due to \(B\) can be expressed as:

$$A_B = f(b_x) + f(b_y) = f(b_x) + b_y = f(b_x) + a_y - \sqrt{1- \left(a_x-b_x\right)^2} $$

This function is non-differentiable or has boundaries when \(b_x=0\), when \(b_x=\frac{1}{3}\), and when \(b_x= a_x - \sqrt{1- \left(a_y-\frac{1}{3}\right)^2}\). If \(x\in\left[0,\frac{1}{3}\right]\), then

$$A_B = a_y - \sqrt{1- \left(a_x-b_x\right)^2}$$

The derivative of this is

$$\frac{dA_B}{db_x} = \frac{2a_x-2b_x}{2\sqrt{1- \left(a_x-b_x\right)^2}} = \frac{a_x-b_x}{\sqrt{1- \left(a_x-b_x\right)^2}}$$

This is equal to 0 only when \(b_x = a_x\). On the other hand, if \(x > \frac{1}{3}\), then

$$A_B = b_x-\frac{1}{3} + a_y - \sqrt{1- \left(a_x-b_x\right)^2} $$

The derivative of this is quite similar to the derivative from before:

$$\frac{dA_B}{db_x} = 1 + \frac{a_x-b_x}{\sqrt{1- \left(a_x-b_x\right)^2}}$$

For this derivative to be equal to 0, this comes to:

$$\begin{align*} 0 &= 1 + \frac{a_x-b_x}{\sqrt{1- \left(a_x-b_x\right)^2}}\\ 1 &= \frac{b_x-a_x}{\sqrt{1- \left(a_x-b_x\right)^2}}\\ \sqrt{1- \left(a_x-b_x\right)^2} &= b_x-a_x\\ 1- \left(a_x-b_x\right)^2 &= \left(b_x-a_x\right)^2\\ 1 &= 0 \end{align*}$$

This is obviously not possible, meaning that the critical points of this area function are when \(b_x=0\), when \(b_x=\frac{1}{3}\), when \(b_y=\frac{1}{3}\), or when \(0 < b_x=a_x < 0\).

If \(b_x,a_x \in \left[0,\frac{1}{3}\right]\), then the formula for area is given by \(a_y - \sqrt{1- \left(a_x-b_x\right)^2}\). It is clear from inspection that this function is larger when \(a_x=b_x\) as opposed to otherwise. Because of this, we conclude that \(B\) lies on either \(x=0\), \(x=\frac{1}{3}\), \(y=0\), or \(y=\frac{1}{3}\). By symmetry, \(A\) also lies on one of these lines. This leads to one of two possibilities:

In one possibility, assume that both \(A\) and \(B\) lie on \(x=0\) and \(y=0\). Without loss of generality, let \(b_x = 0\) while \(a_x = \frac{1}{3}\). Then the distance between \(A) and \(B\) gives the restriction that:

$$\begin{align*} 1 &= \sqrt{\left(a_x-b_x\right)^2+\left(a_y-b_y\right)^2}\\ 1 &= \sqrt{\left(\frac{1}{3}\right)^2+\left(a_y-b_y\right)^2}\\ 1 &= \frac{1}{9}+\left(a_y-b_y\right)^2\\ \frac{8}{9} &= \left(a_y-b_y\right)^2\\ \frac{2\sqrt{2}}{3} &= |a_y-b_y| \end{align*}$$

Assuming that the square lies between these two points, the area of the full shape is then given by:

$$A= \frac{1}{9}+\frac{1}{6}\cdot\left(\frac{2\sqrt{2}}{3}-\frac{1}{3}\right)=\cdot\frac{1+2\sqrt{2}}{18}\approx 0.213$$

In the other possibility, \(A\) lies on a vertical line, while \(B\) lies on a horizontal line, or vice versa. Without loss of generality, let \(a_x=0\), and let \(b_y = 0\).Then the distance between \(A) and \(B\) gives the restriction that:

$$\begin{align*} 1 &= \sqrt{\left(a_x-b_x\right)^2+\left(a_y-b_y\right)^2}\\ 1 &= \sqrt{b_x^2+a_y^2}\\ 1 &= b_x^2+a_y^2\\ 1 - a_y^2 &= b_x^2\\ \sqrt{1 - a_y^2} &= b_x \end{align*}$$

And the total area of the shape is given by:

$$A= \frac{1}{9}+\frac{1}{6}b_x + \frac{1}{6}a_y=\frac{1}{9}+\frac{\sqrt{1 - a_y^2}}{6} + \frac{a_y}{6}$$

Once again, we find the minimum of this using first-term calculus. The derivative of this area is:

$$\frac{dA}{da_y}= \frac{ -a_y}{6\sqrt{1 - a_y^2}}+\frac{1}{6}$$

This derivative evaluates to 0 when

$$\begin{align*} 0 &= \frac{ -a_y}{6\sqrt{1 - a_y^2}}+\frac{1}{6}\\ \frac{a_y}{6\sqrt{1 - a_y^2}} &= \frac{1}{6}\\ a_y &= \sqrt{1 - a_y^2}\\ a_y^2 &= 1 - a_y^2\\ 2a_y^2 &= 1\\ a_y &= \frac{1}{\sqrt{2}} \end{align*}$$

If \(a_y=\frac{1}{\sqrt{2}}\), then \(b_x=\frac{1}{\sqrt{2}}\) as well. This solution is invalid, however, as the instance of \(L\) in this case would not pass through the instance of \(S\). Rather than showing this mathematically, I’ll just show the results of a graphing utility.

The non-differentiable points in the second possibility are identical to the non-differentiable points in the first possibility, proving the theorem.

\(\square\)

Admittedly, the statement of that theorem might have been highly intuitive to you. You might find it much less intuitive to consider elements of \(\left[G,V_{\frac{\pi}{6}}\right]^-\).

Theorem 8
All elements of \([E,G]^-\) are congruent to a triangle with base length 1 and height \(\frac{\sqrt{3}}{4}\), and have an area of \(\frac{\sqrt{3}}{8}\).

proof:

To start, let’s place an instance of \(E\) in the cartesian plane, and we’ll call this instance \(e\) (Euler’s number does not appear anywhere in this article). I’ll place this triangle with two points on the \(x\)-axis, and one point on the \(y\)-axis. Then the points in this triangle are:

$$e_1=\left(\frac{-1}{4},0\right)$$ $$e_2=\left(\frac{1}{4}\right)$$ $$e_3=\left(0,\frac{\sqrt{3}}{4}\right)$$

Note that it isn’t really important which two sides of this triangle make up \(e\), as we’ll be considering the convex hull, so all three sides are included in this graph. I will illustrate this triangle below:

Note that I have labeled the sections of this triangle with numbers 1 though 6. This is because the instance of \(G\) must pass through \(e\) by Theorem 4. Note that \(G\) cannot have endpoints in adjacent sections if it passes through \(e\). Note also that \(G\) cannot have both its endpoints in odd-numbered sections. Therefore, define the endpoints of \(G\) to be \(a=\left(a_x,a_y\right)\) and \(b=\left(b_x,b_y\right)\). Without loss of generality, assume \(a\) is in section 4, and assume \(b\) is in section 1 or 6.

What is the size of \(\left[e,\langle a,b\rangle\right]\)? First, Let’s find the area of \(\left[e,a\right]\). This consists of the area inside \(e\), as well as the area in the triangle defined by the points \(e_1\), \(e_2\), and \(a\).

The area of the grey region inside \(e\) is given by \(\frac{1}{2}\cdot\frac{1}{2}\cdot\frac{\sqrt{3}}{4}\). The area of the blue region created by \(a\) is given by \(\frac{1}{2}\cdot\frac{1}{2}\cdot a_y\). Therefore, the total area of these two regions is:

$$\frac{\sqrt{3}}{16}+\frac{a_y}{4}$$

Let’s start by assuming that \(b\) is located in region 6. In this case, the inclusion of \(b\) creates an additional triangle defined by the points \(e_3\), \(e_2\), and \(b\), as shown below:

The area of this green region is determined by the distance from \(b\) to the line from \(e_3\) to \(e_2\). There’s a useful formula that gives the area of a triangle given the locations of its three points:

$$\begin{align*} Area &=\left|{\frac{b_x\left(\frac{\sqrt{3}}{4}-0\right)+0\left(0-b_y\right)+\frac{1}{4}\left(b_y-\frac{\sqrt{3}}{4}\right)}{2}}\right|\\ &= \left|{\frac{\frac{b_x\sqrt{3}}{4}+\frac{1}{4}\left(b_y-\frac{\sqrt{3}}{4}\right)}{2}}\right|\\ &= \left|{\frac{b_x\sqrt{3}+b_y-\frac{\sqrt{3}}{4}}{8}}\right|\\ &= \frac{b_x\sqrt{3}}{8}+\frac{b_y}{8}-\frac{\sqrt{3}}{32} \end{align*}$$

Recall that \(b\) and \(a\) are exactly one unit apart. This gives the restriction that:

$$\begin{align*} 1 &= \left(b_y-a_y\right)^2 +\left(b_x-a_x\right)^2\\ 1-\left(b_x-a_x\right)^2 &= \left(b_y-a_y\right)^2 \\ \sqrt{1-\left(b_x-a_x\right)^2} &= b_y-a_y \\ \sqrt{1-\left(b_x-a_x\right)^2}+a_y &= b_y\\ \end{align*}$$

Substituting this into the equation of the area of the green region would give

$$Area = \frac{b_x\sqrt{3}}{8}+\frac{\sqrt{1-\left(b_x-a_x\right)^2}+a_y}{8}-\frac{\sqrt{3}}{32} $$

To find the minimum possible value of this area, we take the derivative with respect to \(b_x\) and set that derivative equal to 0.

$$\begin{align*} 0&= \frac{\sqrt{3}}{8}+\frac{-2\left(b_x-a_x\right)}{8\cdot 2\sqrt{1-\left(b_x-a_x\right)^2}}\\ 0&= \frac{\sqrt{3}}{8}-\frac{b_x-a_x}{8\sqrt{1-\left(b_x-a_x\right)^2}}\\ 0&= \sqrt{3}-\frac{b_x-a_x}{\sqrt{1-\left(b_x-a_x\right)^2}}\\ \sqrt{3} &= \frac{b_x-a_x}{\sqrt{1-\left(b_x-a_x\right)^2}}\\ \sqrt{3-3\left(b_x-a_x\right)^2} &= b_x-a_x\\ 3-3\left(b_x-a_x\right)^2 &= \left(b_x-a_x\right)^2\\ 3 &= 4\left(b_x-a_x\right)^2\\ \pm\frac{\sqrt{3}}{2} &= b_x-a_x\\ \end{align*}$$

Again, the distance between \(b\) and \(a\) must be exactly one unit. This means that:

$$\begin{align*} 1 &= \left(b_y-a_y\right)^2 +\left(\pm\frac{\sqrt{3}}{2}\right)^2\\ 1 &= \left(b_y-a_y\right)^2 + \frac{3}{4}\\ \frac{1}{4} &= \left(b_y-a_y\right)^2\\ \frac{1}{2} &= b_y-a_y\\ \end{align*}$$

Because \(b_x>0\), and \(b_y\) is so low above \(a_y\), we may assume that \(b= \left(a_x+\frac{\sqrt{3}}{2},a_y+\frac{1}{2}\right) \). This is signifigant, because we may observe that the slope from \(a\) to \(b\) is \(\frac{1}{\sqrt{3}}\). This is perpendicular to the slope from \(Z_3\) to \(Z_2\), meaning that this location of \(b\) maximizes the area of the green triangle.

The other posibility that would create a critical point for the area of the green triangle is that \(b\) lies on either the \(x\)-axis, or on the line \(y=\sqrt{3}x+\frac{\sqrt{3}}{4}\), as these are the bounds of \(b\)’s location. It is not possible for \(b\) to lie on the \(x\)-axis, as this would result in the line \(G\) lying outside \(e\). Therefore, the minimum area is achieved when \(b\) lies on \(y=\sqrt{3}x+\frac{\sqrt{3}}{4}\).

The other posibility is that \(b\) lies in section 1. If this is the case, we may ignore the area of \(e\), as the overall area can now be defined by a triangle between \(b\), \(e_1\), and \(e_2\).

In this case, the total area of \([e,\langle a,b\rangle]\) is \(\frac{1}{4}a_y+\frac{1}{4}b_y\). Holding the position of \(a\) constant, \(b_y\) is minimized by maximizing \(\left|\left(b_x-a_x\right)\right|\). This means that \(b\) must lie on the boundary of Section 1.

We have concluded that \(b\) must lie on the boundary of section 1, regardless of what other assumptions have been made. Let’s assume, without loss of generality, that \(b\) lies on the line \(y=\sqrt{3}x+\frac{\sqrt{3}}{4}\), while \(a\) is in section 4. From here, we’ll hold the location of \(b\) constant and find the optimal location of \(a\). As before, to minimize the area of the blue triangle formed by \(a\), we must minimize \(a_y\), which means maximizing \(\left|\left(b_x-a_x\right)\right|\). We conclude that \(a\) must lie on the boundary of section 1. If \(a\) lies on the line passing through \(e_3\) and \(e_2\), then \(\langle a,b\rangle\) does not intersect \(e\).

We have now shown that \([e, \langle a,b\rangle]\) is congruent to a triangle with base length 1 and height \(\frac{\sqrt{3}}{4}\). The area of such a triangle is \(\frac{\sqrt{3}}{4}\).

\(\square\)

Some Difficulties in Finding Minimal Covers
#

There are two ideas that I imediately had when considering how to find minimal covers. Neither of them are effective, but it’s informative to see how these strategies can fail. The first would look like this:

Theorem 9
Let \(A\) and \(B\) be curves, and let \(a\) and \(b\) be instances of those curves. Additionally, let \(a’\) be such that \([a’,b] \in [A,B]^-\), while \([a,b] \not\in [A,B]^-\). There does not nececarily exist a linear transformation \(T:a\rightarrow a’\) such that the area of \([T^t(a),b]\) is monotonically decreasing on \(t\in[0,1]\).

Another way of phrasing this is that there might not be a way to smoothly slide a shape into position to minimize the size of the complex hull without first increasing the size of the complex hull. A third way of phrasing it is that there may be local minima in the size of the convex hull of two shapes that is not an absolute minimum.

proof

Let \(b\) be a rectangle defined by the points \(0,0\), \(0,\frac{1}{4}\), \(\frac{1}{2},\frac{1}{4}\), and \(\frac{1}{2},0\). Consider the convex hull of this rectangle and a unit length line \(A\). It was shown in Theorem 7 that the optimal locations for this line are such that its endpoints lie on opposite parallel extensions of the sides of \(b\). This means that the endpoints of \(a\) either lie on \(x=0\) and \(x=\frac{1}{2}\), or they lie on \(y=0\) and \(y=\frac{1}{4}\). Without loss of generality, assume one endpoint of \(a\) lies at the origin. Because the other endpoint of \(a\) must lie a unit away from the origin, the two possible locations are \(\left(\frac{1}{2},\frac{\sqrt{3}}{2}\right)\) and \(\left(\frac{\sqrt{15}}{4},\frac{1}{4}\right)\), as shown below:

The area of the box on its own is \(\frac{1}{2}\cdot\frac{1}{4} = \frac{1}{8}\). The total area of \([a,b]\) is:

$$\begin{align*} \frac{1}{8} + \frac{1}{2}\cdot\frac{1}{2}\cdot\left(\frac{\sqrt{3}}{2} - \frac{1}{4}\right) = \frac{2}{16} + \frac{1}{4}\cdot\frac{2\sqrt{3}-1}{4} = \frac{2\sqrt{3}+1}{16}\approx 0.279 & \text{ if an endpoint lies at } \left(\frac{1}{2},\frac{\sqrt{3}}{2}\right) \\ \frac{1}{8} + \frac{1}{2}\cdot\frac{1}{4}\cdot\left(\frac{\sqrt{15}}{4} - \frac{1}{2}\right) = \frac{4}{32} + \frac{1}{8}\cdot\frac{\sqrt{15}-2}{4} = \frac{\sqrt{15}+2}{32}\approx 0.184 & \text{ if an endpoint lies at } \left(\frac{\sqrt{15}}{4},\frac{1}{4}\right) \\ \end{align*}$$

Suppose \(a\) is located with an endpoint at \(\left(\frac{1}{2},\frac{\sqrt{3}}{2}\right)\). A transformation that brings \(a\) to \(\left(\frac{\sqrt{15}}{4},\frac{1}{4}\right)\) would have to first bring it through the area above \(y=\frac{1}{4}\) and to the right of \(x=\frac{1}{2}\), which would increase the size of \([a,b]\).

\(\square\)

What a shame! I was hoping that it would be possible to smoothly shift shapes to find their best convex hulls. An even bigger issue came when I realized this:

Theorem 10
Let \(A\), \(B\), and \(C\) be curves. It is not necesarily the case that there exists an element of \(\left[A, B, C\right]^-\) accomodates an element of \(\left[A, B\right]^-\).

In other words, it doesn’t work to find the minimal shell of two shapes and then find the minimal shell of that with a third shape. You have to consider all three shapes at the same time.

proof

Consider the set of curves consisting of the unit length line segment, the equilateral triangle with sides of length \(\frac{1}{2}\), and the square with sides of length \(\frac{1}{3}\), or \({G, E, S}\). To cause a contradiction, let \([s,g]\) be an instance of \([S,G]^-\) and let \([g’,e]\) be an instance of \([G, E]^-\), so that

$$[s,g,g’,e]\in[S, G, E]^-$$

Immediately, we have that \([s,g,e]\) is not larger than \([s,g,g’,e]\). Because \([s,g,g’,e]\in[S, G, E]^-\), we have that \([s,g,e]\) cannot be smaller than \([s,g,g’,e]\), and so they must be the same size. By the same reasoning, \([s,g’,e]\) must be the same size as \([s,g,g’,e]\) as well. This means that \(g\) or \(g’\) can be removed from \([s,g,g’,e]\) without shrinking it, but not both. Therefore, \(g\) and \(g’\) must share at least one endpoint on the boundary of \([s,g,g’,e]\). Without loss of generality, let this endpoint be the origin.

First, let’s assume that \(g = g’\). Without loss of generality, we can let \(g\) lie on the \(x\)-axis, with \(e\) lying above the \(x\) axis. We know that \(e\) has a height of \(\frac{\sqrt{3}}{4}\), and a base length of \(\frac{1}{2}\), meaning that \([g,e]\) consists of a triangle between the points \((0,0)\), \((1,0)\), and \((a,\frac{\sqrt{3}}{4})\), with \(a\in[\frac{1}{4},\frac{3}{4}]\).

Without loss of generality, we’ll also assume that the location of \(s\) is such that the lines passing through the endpoints of \(g\) have positive slope. Then the sides of \(\) are defined by the equations:

$$y=\frac{\sqrt{2}}{4}x$$ $$y=\frac{\sqrt{2}}{4}(-1)$$ $$y=-2\sqrt{2}x+b$$ $$y=-2\sqrt{2}x+1+b$$

Where \(b\in[0,2\sqrt{2}-1]\). This is shown below:

The bottom point of this square is located at the intersection of \(y=-2\sqrt{2}x+b\) and \(y=\frac{1}{2\sqrt{2}}\left(x-1\right)\). This point is located at:

$$\left(\frac{2\sqrt{2}b+1}{9}, \frac{b-2\sqrt{2}}{9}\right)$$

This point forms a triangle with the instance of \(g\) on the x axis. That triangle has a size of

$$\frac{b-2\sqrt{2}}{18}$$

There is also a triangle formed by the top left point of \(s\), whenever this point is not inside \([g,e]\). This point is located at

$$\left(\frac{\sqrt{8}\left(1+b\right)}{9},\frac{1+b}{9}\right)$$

This triangle only exists when the top right point of \(s\) is above the top left line of \([e,g]\). This line is determined by

$$y=\frac{\sqrt{3}}{4a-4}\left(x-1\right)$$

Therefore, if this triangle exists, we may assume $$\begin{align*} \frac{1+b}{9} &\geq \frac{\sqrt{3}}{4a-4}\left(\frac{2\sqrt{2}\left(1+b\right)}{9}-1\right)\\ \left(1+b\right)\left(4a-4\right) &\geq\left(2\sqrt{2}\left(1+b\right)-9\right)\sqrt{3}\\ \end{align*}$$

If this is satisfied, there is a triangle with the point of \(g\) located at \(\left(1,0\right)\), and the point of \(e\) located at \(\left(a, \frac{\sqrt{3}}{4}\right)\) with \(a\in\left[\frac{1}{4},\frac{3}{4}\right]\). Using the shoelace formula, the area of this triangle is given by

$$\begin{align*} &\frac{1}{2}\left|\left(1-\frac{2\sqrt{2}\left(1+b\right)}{9}\right)\left(\frac{\sqrt{3}}{4}-0\right)-\left(1-a\right)\left(\frac{1+b}{9}-0\right)\right|\\ &\frac{1}{72}\left|9\sqrt{3}-2\sqrt{6}\left(1+b\right)+4a-4+4ab-4b\right|\\ \end{align*}$$

Because we are assuming that this triangle exists, this may be rewritten as

$$\frac{1}{72}\left(\left(2\sqrt{2}\left(1+b\right)-9\right)\left(\sqrt{3}\right)-\left(4a-4\right)\left(1+b\right)\right)$$

Therefore, it is ideal to maximize \(a\), so let \(a=\frac{3}{4}\). The equation for area then simplifies to

$$\frac{1}{72}\left(\left(2\sqrt{6}+1\right)\left(1+b\right)-9\sqrt{3}\right)$$

This is a linear equation in which area grows with \(b\) proportionally to \(\frac{2\sqrt{6}+1}{72}\aprox 0.0819\). This is quicker growth than the growth of the bottom triangle, which grows proportianally to \(\frac{1}{18}\aprox 0.055\). We may conclude that \(b\) must be such that

$$\left(1+b\right)\left(4a-4\right) =\left(2\sqrt{2}\left(1+b\right)-9\right)\sqrt{3}$$

$$\begin{align*} \left(1+b\right)\left(4a-4\right) &=\left(2\sqrt{2}\left(1+b\right)-9\right)\sqrt{3}\\ \left(1+b\right) &= 9\sqrt{3}-2\sqrt{6}\left(1+b\right)\\ b &=\frac{9\sqrt{3}}{2\sqrt{6}+1}-1\\ \end{align*}$$

This is shown below:

It is evident that \(s\) can be translated linearly along the line \(y=\frac{\sqrt{3}}{4a-4}\left(x-1\right)\) to decrease the overall size of \([e,g,s]\), meaning that \([e,g,s]\not\in[E,G,S]^-\).

So suppose instead that \(g\neq g’\), and that

\(\square\)

In a certain sense, this is absolutely disasterous.

Proxy Accommodated Curves
#

As mentioned, one of the big hopes in this project is that we should be able to ignore certain curves because they are accomodated by the unions of other curves. For example, let’s look at what else might be covered by one of the elements of \([E,G]^-\).

Theorem 12
The triangle consisting of points at \(\left(\frac{-1}{2},0\right)\), \(\left(\frac{1}{2},0\right)\), and \(\left(0,\frac{\sqrt{3}}{4}\right)\) acommodates \(V_\theta\) for any value of \(\theta\), and \(V_\theta\) for any value of \(\theta\).
proof

Let \(t\) be the triangle described in the theorem statement.

For any angle \(\theta\), define \(v_\theta\) to contain the points \(v_1=\left(0,\frac{1}{2}\cos{\frac{\theta}{2}}\right)\), \(v_2=\left(\frac{1}{2}\sin{\frac{\theta}{2}},0\right)\), and \(v_3=\left(-\frac{1}{2}\sin{\frac{\theta}{2}},0\right)\). Note first that the distance between \(v_1\) and \(v_2\) is:

$$\sqrt{\left(\frac{1}{2}\cos{\frac{\theta}{2}}\right)^2+\left(-\frac{1}{2}\sin{\frac{\theta}{2}}\right)^2}=\frac{1}{2}\sqrt{\cos^2{\frac{\theta}{2}}+\sin^2{\frac{\theta}{2}}}=1$$

By symmetry, the distance from The angle formed by this shape is \(\frac{1}{2}\). The angle formed at \(v_1\) is \(\theta\), confirming that this is an instance of \(V_\theta\). Note that \(\frac{1}{2}\sin{\frac{\theta}{2}} < \frac{1}{2}\) for all values of \(\theta\), meaning that \(v_2\) lies inside \(t\). By symmetry, \(v_2\) also lies inside \(t\).

\(v_1\) lies inside \(t\) whenever \(\frac{1}{2}\cos{\frac{\theta}{2}}<\frac{\sqrt{3}}{4}\). This is true when \(\theta<\frac{pi}{6}\). Since \(V_\theta\) is not taut for \(\theta < \frac{\pi}{6}\), we won’t consider these anyway.

For any radius \(r\), let \(a_r\) be the curve defined with \(y=\sqrt{r^{2}-x^{2}}-r\sin\left(\frac{\pi r-1}{2r}\right)\), where \(y>0\). The line that defines the right half of \(t\) is \(y=-\frac{\sqrt{3}}{2}x+\frac{\sqrt{3}}{4}\). These two lines intersect when

$$\begin{align*} \sqrt{r^{2}-x^{2}}-r\sin\left(\frac{\pi r-1}{2r}\right) &= -\frac{\sqrt{3}}{2}x+\frac{\sqrt{3}}{4}\\ \sqrt{r^{2}-x^{2}}-r\sin\left(\frac{\pi r-1}{2r}\right) &= \frac{\sqrt{3}}{4}(1-2x)\\ \sqrt{r^{2}-x^{2}} &= \frac{\sqrt{3}}{4}(1-2x)+r\sin\left(\frac{\pi r-1}{2r}\right)\\ r^{2}-x^{2} &= \frac{3}{16}(1-2x)^2+\frac{\sqrt{3}}{2}(1-2x)r\sin\left(\frac{\pi r-1}{2r}\right)+r^2\sin^2\left(\frac{\pi r-1}{2r}\right)\\ r^{2}-x^{2} &= \frac{3}{16}-\frac{3}{8}x+\frac{3}{4}x^2+\frac{\sqrt{3}r}{2}(1-2x)\sin\left(\frac{\pi r-1}{2r}\right)+r^2\sin^2\left(\frac{\pi r-1}{2r}\right)\\ r^2 &= \frac{3}{16}-\frac{3}{8}x+\frac{7}{4}x^2+\frac{\sqrt{3}r}{2}(1-2x)\sin\left(\frac{\pi r-1}{2r}\right)+r^2\frac{1-\cos\left(\frac{\pi r-1}{r}\right)}{2}\\ 0 &= \frac{3}{8}-\frac{3}{4}x+\frac{7}{2}x^2+\sqrt{3}r(1-2x)\sin\left(\frac{\pi r-1}{2r}\right)-r^2-r^2\cos\left(\frac{\pi r-1}{r}\right)\\ \end{align*}$$

\(\square\)

Eliminating Type 2 Curves (hopefully)
#

This article is getting quite long! For fear of making something that’s difficult to read, and difficult for webpages to load, I’ve decided to break the second half of this project into its own article. It’ll be nice to be able to break the project down into discrete sections. Continue reading here!