HowdyTX

UNIX History & the `dc` Calculator

Ethan Barry — Fri, 23 Aug 2024 21:20:16 GMT

"...an elegant weapon for a more civilized age."

Or, why RPN is the coolest thing since sliced silicon.

When you need to do a bit of math, like most people, you probably reach for your phone's built-in calculator. If you're on a computer, you might bust open the Windows calculator, or any of the Linux equivalents. Perhaps, if you're in a terminal, you do something like:

$ python
Python 3.12.4 (main, Jun  7 2024, 06:33:07) [GCC 14.1.1 20240522] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 39480 * 4
157920
>>>

You may even know that there's a dedicated UNIX utility, bc, which can do the exact same thing, and can run interactively, or with piped input, or a script. But have you ever tried dc? Maybe your fingers slipped one day, and you accidentally typed dc instead of cd, as I did. In that case, like me, you were presented with a simple blank line—not a prompt, not a copyright message, but just a blank line.

If you had that experience, you may have typed man dc out of curiosity (tldrdc if you were in a hurry). In that case, you'd see this:

dc(1)                                                                 
General Commands Manual                                                     
            dc(1)

NAME
       dc - an arbitrary precision calculator

SYNOPSIS
       dc [-V] [--version] [-h] [--help]
          [-e scriptexpression] [--expression=scriptexpression]
          [-f scriptfile] [--file=scriptfile]
          [file ...]

DESCRIPTION
       dc is a reverse-polish desk calculator which supports unlimited
precision arithmetic.  It also allows you to define and call macros.  
Normally dc reads from the  standard  input;  if any command arguments are 
given to it, they are filenames, and dc reads and executes the contents of 
the files before reading from standard input.  All normal output is to 
standard output; all error output is to standard error.

       A reverse-polish calculator stores numbers on a stack.  Entering a 
number pushes it on the stack.  Arithmetic operations pop arguments  off  
the  stack  and push the results.

       To enter a number in dc, type the digits (using upper case letters A 
through F as "digits" when working with input bases greater than ten), with 
an optional  decimal  point.  Exponential notation is not supported.  To 
enter a negative number, begin the number with ‘‘_''.  ‘‘-'' cannot be used 
for this, as it is a binary operator for subtraction instead.  To enter two 
numbers in succession, separate them with spaces or newlines.  These have 
no meaning as commands.

Well, that seems redundant. Just how many calculators does one need? My dear reader, the correct answer is "all of them." But what makes this calculator special, and what is all that about "reverse-polish calculators" and "stacks?"

To answer that, we need to examine the usual way of writing math. For example, when you want to operate on a couple of numbers, you typically write the operator between them, like this: \(1 + 2 = 3\). But there are other ways. There's function notation, which might look like \(\mathrm{add}(1, 2) = 3\). You could even have a lambda calculus notation, e.g. \(\mathrm{add}\,m\,n = \lambda s.\,\lambda z.\,m\,s\,(n\,s\,z)\) where \(1 + 2 = 3\) would be written \(5 = \mathrm{add}\,2\,3 = \lambda s.\, \lambda z.\,2\,s\,(3\,s\,z)\).

You could also have prefix, or Polish notation, where addition is written \(+\,3\,4\). This is really recognizable to lisp users. Reverse Polish notation, or postfix notation, is the opposite. It has operators following operands, like so: \(3\,4\,+\). You may wonder why on earth anyone would write math that way, but there is a good technical reason. Reverse Polish notation, or RPN, removes the need for operator precedence.

If you write a naïve program reading symbols and doing operations for infix notation, you'll quickly wind up tripping over expressions like \(3 + 4 \times 9 + 2\). The correct answer is \(41\). A simple interpreter would evaluate from left to right, and wind up with 63. To correctly apply the order of operations with infix notation, you would need to include either a parser that builds and evaluates an expression tree, or parentheses to specify the order. In contrast, the RPN equivalent of \(4\,9\,\times\,2\,3\,+\,+\) is easy to parse. Here's why.

Parsers aren't hard to implement for simple infix notation, but right-associative operators like exponentiation can be more difficult. In any case, a parse tree needs to be built somehow, and the tree needs to be simplified from its leaves downwards to its root. But with postfix operators, the only data structure you need is a stack.

The process is simple. Push numbers onto the stack. When you encounter a binary operator, pop two and operate, else, pop one. Push the result back onto the stack. At the end, the answer will be the only number left.

The History

This is a very simple algorithm, and it needs only very simple tools. Perhaps this is why dc, the UNIX RPN calculator, was the first utility ported to UNIX, way back in the PDP-11 days. It even predates the C programming language; originally it was implemented in B. This program is ancient in computing terms—right up there with dinosaurs, magnetic tape, and punch cards. Wikipedia calls it the "oldest surviving UNIX program."

RPN isn't unique to dc. The first recognizable electronic (or electromechanical) computer, Konrad Zuse's Z3, used RPN as its input scheme in 1941. Since that time, all manner of calculators and computers, desktop, handheld, or otherwise, have used RPN as the input scheme because it is efficient and unambiguous. Hewlett-Packard's calculators notably used RPN for every field, from accounting to engineering. You may know the venerable HP-15C, which offered everything from matrix operations to numerical integration, all with RPN on 1980s hardware.

Ah, simplicity...

Now

Just for fun, yesterday afternoon I implemented a little dc-like calculator in Rust. It doesn't have registers or support for macros yet, but it does implement addition/subtraction, multiplication/division, exponentiation, and square and cube roots. The arbitrary precision is supplied by the bigdecimal crate. I thought about using Malachite, but neither it nor num support taking roots. They want their operations to remain closed on the ring ℚ, and roots are a great way to get irrational numbers!

You can find the source code here. Just clone the repo and do:

cargo run

to start the program. Then, type in your RPN expressions. You can print the entire stack with "p," clear it with "c," print the top item with "t," and take square and cube roots with "v" and "b," respectively.

I'm especially proud of my little prompt macro, which is the first Rust macro I've written. I never understood why something like this wasn't included in the standard library, but here you go, for what it's worth:

/// A simple macro to grab input from the user.
#[macro_export]
macro_rules! prompt {
    ($prompt:expr) => {{
        print!("{}", $prompt);
        use std::io::Write;
        std::io::stdout()
            .flush()
            .expect("Writing to stdout failed!");
        let mut input = String::new();
        match std::io::stdin().read_line(&mut input) {
            Ok(0) => std::process::exit(0), // Handle EOF
            _ => {}
        };
        input.trim().to_string()
    }};
}

You use it like this, which I think is pretty handy.

let input = prompt!("Enter a number> ");

Thanks for reading! I'd appreciate your comments. Please, connect with me on LinkedIn, or email me at ethan.barry@howdytx.technology.

Creating a LaTeX Template for UT Tyler

Ethan Barry — Thu, 20 Jun 2024 11:46:37 GMT

Because everyone needs more LaTeX in their life.

At least, the easy-to-use kind.

Most folks in STEM fields know about \(\LaTeX\). It's the gold standard for scientific writing, desktop publishing, and math typesetting. Its algorithms still beat MS Word's, Google Docs', or even Adobe's InDesign's. (If you haven't heard of \(\LaTeX\), it's a way to make PDFs with code.)

The website Overleaf.com is an easy-to-use experience for anyone who would rather sit down and write than fiddle with \(\LaTeX\) installation on, say, a Windows computer. Because of that, many thousands of students, academics, and professionals worldwide use it daily to typeset documents. The platform has a huge collection of templates, which can provide your document's formatting and looks.

The Problem

Half the problem is that most of the templates are boring, and only some are exceptional. Of course, any one you choose will produce a better document than MS Word would give you. The issue is that most users never tap into the superpowers of \(\LaTeX\). This is the fault of something called the "document class." A document class defines macros (shortcuts) for the user to set text with. The document classes everyone uses (article, book, &c.) are old, try to be as backwards-compatible as possible, and break easily if you try to be too cute.

Ask me how I know.

Anyway, for a new document, people should ideally be using the memoir class. It is a modern class (from 2001!) which is entirely dedicated to ergonomics. The original developer wanted users to create professional documents without knowing anything about typography. It provides a whole host of deeply integrated functionality that article and book can only dream of.

Side Note: The class has 400 pages of documentation, 500 pages in the user manual, and even comes with a 130 pages of typographical notes by the developer. You will not lack for help resources if you adopt memoir.

Having a general overabundance of okay-ish templates wouldn't be a problem on its own. The other half of the problem is that many universities provide \(\LaTeX\) templates, while mine does not! So, can we create a decent template using memoir that fixes UT Tyler's lack of an Overleaf template?

The Solution

We have a default article document that looks like this:

Icky article class with lorem ipsum text.

But, we want to turn it into something pretty. If we simply change from article to memoir as our document class, very little happens. We need to do more. I'll walk through the changes I made, and a beginner-level \(\LaTeX\) user should have no problem following along in the code, available here.

Here is a page from the final result:

My template, showcasing some math and graphics typesetting.

So, what changed?

Visible Changes

First, the font. There's nothing wrong with the default latin-modern font, but it is a bit spindly at larger sizes, and we want something classic. You'll find there is no better option than EB Garamond. If you haven't overridden my CSS, this blog is set in EB Garamond as well. You can insert \usepackage{ebgaramond} to set it as the main font. I've included a package that sets it as the math font, too. Notice how nice the Schrödinger equation looks!

Second, the margins are different—I used a wider right margin so I can insert margin notes—which gives the document a more book-like feel. I want to use margin notes to introduce variables. I saw this done in an older journal, and I think it's a very handy idea. It's a shame we don't see it more often. Notice how \(x\) and \(z\) are shown below. If you were skimming the paper for some variable's definition, you could easily find it!

A showcase of what margins and pretty math fonts can do.

Boldface is used sparingly, with italics and smallcaps (\scshape) providing most of the emphasis. Bold faces are not a major element of classic typography, so it's nice to minimize their use. It goes a long way to making the document more readable.

Finally, my favorite part is the title page and frontmatter. I'll let the images speak for themselves.

The title page, showing the university seal in all its glory...

The copyright page, showing nice spacings and beautiful small caps that only EB Garamond can provide.

The title page is inspired by this template by Mario Vilar on Overleaf, though I didn't use any code from it in this document. I especially like his faint background. It's similar to my research paper's title page here.

Changes "Under the Hood"

The code itself is fanatically organized and heavily commented. I want it to be as easy as possible to do whatever needs to be done. It has sane defaults for document divisions and page styles, with most of the options visible in the preamble.

You can explore it for yourself on GitHub or Overleaf.

Conclusion

I am excited to see whether this gets any notice on Overleaf from my fellow students, but regardless of whether it does, I want to use it for my own papers. Hopefully none of the professors will be too draconian about formatting rules in the coming semesters.

As for extensions to the template, I'm adding some macros for margin notes and variables in the margins. I realized after I took the screenshots that the margin notes should be set \raggedright instead of justified. I think a macro that takes the note as an argument and automatically formats it will be useful.

As always, please send me any feedback you have at ethan.barry@howdytx.technology, or drop me a message on LinkedIn!

The Best Possible Quadrature Routine (Within Reason)

Ethan Barry — Mon, 10 Jun 2024 18:15:15 GMT

Gauss-Lobatto-Kronrod Quadratures

...in Rust, of course.

Originally all I wanted to do with numerical analysis was string together some functions and plot an ephemeris, given the right data. Planetary motions are fascinating, and I think it will be a really cool project. However, after reading the first chapter of the textbook Celestial Mechanics, where the author lists important numerical methods, I realized that numerical analysis by itself is something interesting.

In the last section of the first chapter he gives a list of programs or routines that are useful to him. One of those was using Gaussian quadrature to compute a definite integral. I already discussed the problem of quadrature in the post on Simpson's Rule, and I mentioned Gaussian quadrature specifically in the post on Lagrange interpolation. But, it's still worth revisiting the question of what quadrature means, before I share what I learned about Gaussian quadrature.

Computing Definite Integrals

In calculus we learn that a definite integral is actually the area beneath a curve. We approximate this with a sum of the areas of many small slices. In fact, the exact value is the number the sum approaches as the number of slices goes to infinity. This is called the Riemann sum approach, and essentially, it chops the integral up in nice rectangular bits. If we wanted to approximate an integral, one way would be to chop it up into lots of rectangles, and add their areas. If we want a closer approximation, then we just make finer slices.

This is certainly a fine method, but it would be a nightmare to compute by hand. Every slice needs its own function evaluation. Even in a computer, this can be slow. And, since we would be using rectangles to approximate a smooth function, this would still be inaccurate.

If, instead of using a flat-topped rectangle, we used a slope-topped trapezoid as our slice shape, we could make the slope line up with the function. That would mean our calculation would be more accurate, and would require fewer function evaluations. This is the trapezoid rule.

And, if we use a quadratic curve as the top of the slice, it can fit the function even more tightly still. That was Simpson's Rule, from my earlier post. In fact, there's a whole class of quadrature rules called Newton-Cotes formulas, which extend this idea, using higher-order polynomials to interpolate the points.

With these methods, we have one degree of freedom. We get to choose the weighting coefficients in equations like Simpson's Rule:

\[\int_a^b f(x)\,dx \approx \frac{1}{3}(f(a) + 4f(c) + f(b))\]

The big idea behind Gaussian quadrature is that it allows us to choose both the weights and the locations at which we evaluate \(f(x)\), giving us two degrees of freedom. With Newton-Cotes formulas, our accuracy increased at \(\mathcal{O}(n)\), as we added more slices. Now, we can converge to the result at \(\mathcal{O}(n^2)\), with the extra degree of freedom.

Let's examine the main idea.

Gaussian Quadrature

Simpson's Rule and the rest of the Newton-Cotes formulas essentially represent the integrand with polynomials. Gaussian quadrature lets us represent our integrand as a polynomial times a weight function. This is good, because some functions aren't nicely interpolated by polynomials. We can also evaluate at different points by applying weights to the evaluations.

This gives us the general Gaussian quadrature rule

\[\int_{-1}^1 W(x)f(x)\,dx \approx \sum_{1=0}^n w_if(x_i)\]

The polynomials we need to use are called Legendre polynomials. So, in our equation, \(n\) is the number of sample points, \(w_i\) are the weights, and \(x_i\) are the roots of the \(n\)th Legendre polynomial. The integral is given on the interval \([-1,1]\) because that is where Legendre polynomials have most of their useful properties, but we can scale up or down to any interval \([a,b]\).

We can compute the roots on the fly for any order \(n\), within reason. There is a technique called the Newton-Raphson method which finds roots of nonlinear functions. It takes a starting guess and iteratively improves it until the difference between the new value and the old one is within tolerance. It's a surprisingly simple technique.

\[x_{new} = x_{old} - \frac{f(x_{old})}{f'(x_{old})}\]

Now we just need the polynomials! There's a recurrence relation for them, so starting with the first two, we can build all the Legendre polynomials after that.

\[\begin{align}p_{-1}(x) &= 0\\ p_0(x) &= 1\\ p_{j+1}(x) &= (x-a_j)p_j(x) - b_jp_{j-1}(x),\quad j = 1,2,...,n\end{align}\]

We need a good guess to use Newton-Raphson. Usually the roots are closely approximated by \[r=\cos\left(\pi\frac{i + \frac{3}{4}}{n + \frac{1}{2}}\right)\quad i = 0,...,\frac{(n+1)}{2}\] where \(n\) is the order of the quadrature.

Once we have the weights \(w_i\), given by \[w_i = \frac{2}{(1-x_i^2)\left[p_n'(x_i)\right]^2}\] where \(p_n'\) is the derivative of the \(n\)th Legendre polynomial. Finally, we can mix all these numbers together in some code. But first, I want to show a result this algorithm gives.

Here's a test problem with the correct approximation to 9 decimal places: \[\int_0^1 \frac{ x^4 }{ \sqrt{ 2(1+x^2) }}\,dx \approx 0.108\,709\,465\,052\,586\,44\] The Rust code I wrote returns this value to nine decimal places with a 7-point Gauss-Legendre quadrature. That means the function only had to be evaluated seven times. I also used Simpson's Rule with seven evaluations, and it gave me an answer of 0.108 710, which is not bad.

Now, if I raise the Gaussian quadrature's order to eight, I get the answer 0.108 709 465 037, which is correct to 10 decimal places. If I raise the Simpson's Rule order to eight as well, I get 0.108 709 862, which is still only correct to six decimal places.

The code is below.

pub fn gaussian_quad(f: F, a: f64, b: f64, n: u32) -> f64
where
    F: Fn(f64) -> f64,
{
    const NEWTON_PRECISION: f64 = 1.0e-10;
    let mut res = 0.0;
    // Scale the bounds of integration a and b to [-1, 1]
    // using the change of variables
    // x = st + c for t ∈ [-1, 1]
    // where
    // s = (b - a) / 2
    // and
    // c = (b + a) / 2.
    let s = (b - a) / 2.0;
    let c = (b + a) / 2.0;
    let m = (n + 1) / 2;
    let mut roots = vec![0.0; n as usize];
    let mut weights = vec![0.0; n as usize];
    for i in 0..m {
        // Approximate the ith root of this Legendre polynomial.
        let mut z = f64::cos(consts::PI * (i as f64 + 0.75) / (n as f64 + 0.5));
        let mut z1 = z - 1.0; // Ensure an initial difference.
        let mut p_prime = 0.0;
        while (z - z1).abs() > NEWTON_PRECISION {
            let mut p1 = 1.0;
            let mut p2 = 0.0;
            // Build the Legendre polynomial using the recurrence relation here...
            for j in 0..n {
                let p3 = p2;
                p2 = p1;
                p1 = ((2.0 * j as f64 + 1.0) * z * p2 - j as f64 * p3) / (j + 1) as f64;
            }
            // p1 is now the desired Legendre polynomial.
            // let p_prime be its derivative, computed by a known relation.
            p_prime = n as f64 * (z * p1 - p2) / (z * z - 1.0);
            z1 = z;
            z = z1 - p1 / p_prime; // Newton-Raphson method.
        }
        // z is now the ith root of the polynomial.
        roots[i as usize] = z;
        roots[(n - 1 - i) as usize] = -z;
        weights[i as usize] = 2.0 / ((1.0 - z * z) * p_prime * p_prime);
        weights[(n - 1 - i) as usize] = weights[i as usize];
    }

    for i in 0..n as usize {
        res += weights[i] * f(s * roots[i] + c);
    }
    res * s // Scale back to normal size.
}

Now, if we have a function that is well-behaved, this will work nicely. But if there's an irregularity in just one or two places, we have to raise the number of points just to cover those two subintervals. Wouldn't it be nice if we could somehow adapt to a problem like that, and only do more evaluations when we need them?

Adaptive Quadrature

Remember that the heart of numerical integration is chopping area into smaller pieces. If there's an interval where the function goes crazy, and one where it's smooth, then let's not worry about the smooth interval too much, and focus on the crazy one. How do we measure "craziness?" The general method is to evaluate the integral on the interval twice—once at low precision, then again at high precision—which gives us a difference. If that difference is within a tolerance, then we leave it alone. If not, we split the interval and repeat the process.

You might think that this will cost us a lot of function evaluations. It will take more, but maybe we can minimize the number needed. And an adaptive routine is already saving us evaluations in comparison to just globally increasing evaluations to boost accuracy, like with Simpson's Rule.

With Gaussian quadrature, a low-precision rule will not have the same \(x\)-coordinates as a high-precision one. That means we can't reuse any of those function values. What we're looking for is a nested quadrature, which "nests" a higher-order quadrature inside a lower-order one. There are two options, Clenshaw-Curtis, and Gauss-Kronrod. Both were invented around 1960; the first by two Western (British, I think) mathematicians, and the second by a Soviet mathematician, Aleksandr Semyonovich Kronrod.

This particular algorithm, from the paper "Adaptive Quadrature—Revisited" by W. Gander and W. Gautschi, is an example of Gauss-Lobatto-Kronrod quadrature. Kronrod's extension of Gaussian quadrature allows us to reuse function evaluations for higher precision, and the "Lobatto" part means we use the endpoints as well.

A Kronrod Rule uses \(2n+1\) points, where \(n\) is the number of points for the Gaussian quadrature. This makes the Kronrod Rule exact for a polynomial up to degree \(3n+1\). The extra \(x\)-values (abscissas) are nowadays calculated with some linear algebra tricks, as described in the paper. They can also be found by solving non-linear equations with Newton's Method, but because either one would be expensive, we'll use tabulated values.

As an example of what Gaussian quadrature and the Kronrod extension are really doing, I produced a little graphic with gnuplot. Here is a 4-point Gauss-Lobatto quadrature of our example \(\int_0^1 \frac{ x^4 }{ \sqrt{ 2(1+x^2) }}\,dx\). The lines pass through the \(y\)-values of the Gaussian abscissas, so the abscissas lie below the intersection of the line and the curve. The Kronrod abscissas would occur somewhere between the Gaussian ones.

Function plot showing where the Gaussian abscissas land.

The function I created has a pretty simple interface. I didn't include controls on how many iterations to allow, or what precision to calculate the integral to. It simply computes until the answer is stable to within machine precision.

pub fn gaussian_kronrod_quad(func: F, a: f64, b: f64) -> f64
where
    F: Fn(f64) -> f64,
{
/* SNIP; CODE BELOW */
}

This has given very good results on the integrals I've tested. For example, on our problem from earlier, \[\int_0^1 \frac{ x^4 }{ \sqrt{ 2(1+x^2) }}\,dx \approx 0.108\,709\,465\,052\,586\,44\] I get the answer to the ~17 digits allowed by a double precision float f64.

The basic steps are:

Evaluate a 4-point Gauss-Lobatto quadrature on the scaled interval.
Evaluate a 7-point Kronrod extension using the 4-point values.
Evaluate a general estimate for the entire integral with a 13-point Kronrod extension using stored values from the previous two.
If the 4- and 7-point values differ by more than machine precision, split the interval into six parts, and perform the same 4- & 7-point procedure on those. Add their results.

This is a very good quadrature method, comparable to Julia's QuadGK.jl package, or some tools from other dedicated systems, like GNU OCTAVE and MATLAB. In fact, in the paper linked above, the authors show that it was actually an improvement over the quadatures available at the time in MATLAB, and the NAG and IMSL libraries.

Conclusion

The speed and type safety of Rust make it a very attractive option for scientific computing, but so far, it hasn't made much headway against more established languages like FORTRAN, Python, and Julia. The numerical analysis and linear algebra packages in Rust are improving, but they're often less-complete than counterparts elsewhere.

I think the killer application of Rust in this space might be concurrency. Rust makes parallelization much more achievable. Rust's SIMD wrappers are nicer than most languages' equivalents as well, so parallelizing and vectorizing sci-comp workloads is a unique benefit Rust can offer. I'm excited to see what happens in this space.

/// An adaptive 4-point Gauss-Lobatto quadrature routine.
/// Kronrod's Rule is used to reduce the computations needed,
/// while using 7- & 13-point Kronrod quadratures for error checking.
///
/// ## Inputs
/// - f: The integrand (must be integrable on \[a,b\])
/// - a: The lower bound of integration.
/// - b: The upper bound of integration.
///
/// NOTE: The algorithm assumes a < b.
/// ## Outputs
/// An `f64` value containing the definite integral calculated to machine precision.
/// ## Examples
/// ```rust
/// use symmetria::quadrature::gaussian_kronrod_quad;
/// use std::f64::consts;
/// let f = |x: f64| x.cos();
/// let val = gaussian_kronrod_quad(f, 0.0, (consts::PI / 2.0));
/// assert!((1.0 - val).abs() < 1.0e-17);
/// ```
pub fn gaussian_kronrod_quad(func: F, a: f64, b: f64) -> f64
where
    F: Fn(f64) -> f64,
{
    // This algorithm was adopted from the paper
    // "Adaptive Quadrature --- Revisited" by Walter Gander and Walter Gautschi.
    // published in BIT Numerical Mathematics, vol. 40, No. 1, pp. 84-101.

    // Scale the bounds of integration a and b to [-1, 1].
    // This converts
    // \int_a^b f(x)dx to \int_{-1}^1 f(sx + c)sdx
    // using the change of variables
    // x = scale * t + coscale for t ∈ [-1, 1]
    // where
    // scale = (b - a) / 2
    // and
    // coscale = (b + a) / 2.
    let scale = (b - a) / 2.0;
    let coscale = (b + a) / 2.0;
    // Now, f must always be evaluated at scale * x + coscale, not simply x, and the answer multiplied by scale.

    let gl4_weights = [1.0 / 6.0, 5.0 / 6.0];
    let gl4_fnevals = [
        func(scale * -1.0 + coscale),
        func(scale * 1.0 + coscale),
        func(scale * -1.0 / 5.0_f64.sqrt() + coscale),
        func(scale * 1.0 / 5.0_f64.sqrt() + coscale),
    ];
    // Next, use the 4-point Gauss-Lobatto quadrature given by
    // \int_{-1}^1 f(x)dx ≈ \frac{1}{6}[f(-1) + f(1)]
    //                      + \frac{5}{6}[f(-\frac{1}{\sqrt{5}}) + f(\frac{1}{\sqrt{5}})]
    let gl4 = gl4_weights[0] * ((gl4_fnevals[0] + gl4_fnevals[1]) * scale)
        + gl4_weights[1] * ((gl4_fnevals[2] + gl4_fnevals[3]) * scale);

    let kron7_fnevals = [
        func(scale * -((2. / 3.0_f64).sqrt()) + coscale),
        func(scale * (2. / 3.0_f64).sqrt() + coscale),
        func(coscale), // scale * 0.0 is 0.
    ];
    // Then we compare this to its 7-point Kronrod extension, given by
    // \int_{-1}^1 f(x)dx ≈ \frac{11}{210}[f(-1) + f(1)]
    //                      + \frac{72}{245}[f(-\sqrt{\frac{2}{3}}) + f(\sqrt{\frac{2}{3}})]
    //                      + \frac{125}{294}[f(-\frac{1}{\sqrt{5}}) + f(\frac{1}{\sqrt{5}})]
    //                      + \frac{16}{35}[f(0)]
    let kron7 = (11. / 210.) * ((gl4_fnevals[0] + gl4_fnevals[1]) * scale)
        + (72. / 245.) * ((kron7_fnevals[0] + kron7_fnevals[1]) * scale)
        + (125. / 294.) * ((gl4_fnevals[2] + gl4_fnevals[3]) * scale)
        + (16. / 35.) * (kron7_fnevals[2] * scale);

    // Finally, we compare all of them to the 13-point Kronrod extension, given by
    // \int_{-1}^1 f(x)dx ≈ A * [f(-1) + f(1)]
    //                      + B * [f(-x_1) + f(x_1)]
    //                      + C * [f(-\sqrt{\frac{2}{3}}) + f(\sqrt{\frac{2}{3}})]
    //                      + D * [f(-x_2) + f(x_2)]
    //                      + E * [f(-\frac{1}{\sqrt{5}}) + f(\frac{1}{\sqrt{5}})]
    //                      + F * [f(-x_3) + f(x_3)]
    //                      + G * [f(0)]
    // where
    // A = .015827191973480183087169986733305510591,
    // B = .094273840218850045531282505077108171960,
    // C = .15507198733658539625363597980210298680,
    // D = .18882157396018245442000533937297167125,
    // E = .19977340522685852679206802206648840246,
    // F = .22492646533333952701601768799639508076,
    // G = .24261107190140773379964095790325635233,
    // and
    // x_1 = .94288241569547971905635175843185720232,
    // x_2 = .64185334234578130578123554132903188354,
    // x_3 = .23638319966214988028222377349205292599.
    let a_weight = 0.015_827_191_973_480_183;
    let b_weight = 0.094_273_840_218_850_05;
    let c = 0.155_071_987_336_585_4;
    let d = 0.188_821_573_960_182_45;
    let e = 0.199_773_405_226_858_52;
    let f = 0.224_926_465_333_339_51;
    let g = 0.242_611_071_901_407_74;

    let x_1 = 0.942_882_415_695_479_7;
    let x_2 = 0.641_853_342_345_781_3;
    let x_3 = 0.236_383_199_662_149_88;

    let kron13 = a_weight * (gl4_fnevals[0] + gl4_fnevals[1]) * scale
        + b_weight * (func(scale * -x_1 + coscale) + func(scale * x_1 + coscale)) * scale
        + c * (kron7_fnevals[0] + kron7_fnevals[1]) * scale
        + d * (func(scale * -x_2 + coscale) + func(scale * x_2 + coscale)) * scale
        + e * (gl4_fnevals[2] + gl4_fnevals[3]) * scale
        + f * (func(scale * -x_3 + coscale) + func(scale * x_3 + coscale)) * scale
        + g * kron7_fnevals[2] * scale;

    let x = vec![a, b];
    let y = batch_eval(&func, &x);

    gkquad_recursive(&func, a, b, y[0], y[1], kron13)
}

/// A helper function for gaussian_kronrod_quad().
fn gkquad_recursive(f: &F, a: f64, b: f64, fa: f64, fb: f64, iest: f64) -> f64
where
    F: Fn(f64) -> f64,
{
    let h = (b - a) / 2.0;
    let m = (a + b) / 2.0;
    let alpha = f64::sqrt(2.0 / 3.0);
    let beta = 1.0 / f64::sqrt(5.0);

    // Our new evaluation points...
    let mll = m - alpha * h;
    let ml = m - beta * h;
    let mr = m + beta * h;
    let mrr = m + alpha * h;
    let x = vec![mll, ml, m, mr, mrr];
    let y = batch_eval(&f, &x);

    let fmll = y[0];
    let fml = y[1];
    let fm = y[2];
    let fmr = y[3];
    let fmrr = y[4];

    let i2 = (h / 6.0) * (fa + fb + 5.0 * (fml + fmr));
    let i1 =
        (h / 1470.) * (77. * (fa + fb) + 432. * (fmll + fmrr) + 625. * (fml + fmr) + 672. * fm);

    if iest + (i1 - i2) == iest || mll <= a || b <= mrr {
        if m <= a || b <= m {
            // At this point, we have exhausted the machine's precision.
            // We should handle this somehow...
        }
        i1
    } else {
        // Sum of the integrals on the ranges [a, mll], [mll, ml], [ml, m], [m, mr], [mr, mrr], [mrr, b];
        gkquad_recursive(f, a, mll, fa, fmll, iest)
            + gkquad_recursive(f, mll, ml, fmll, fml, iest)
            + gkquad_recursive(f, ml, m, fml, fm, iest)
            + gkquad_recursive(f, m, mr, fm, fmr, iest)
            + gkquad_recursive(f, mr, mrr, fmr, fmrr, iest)
            + gkquad_recursive(f, mrr, b, fmrr, fb, iest)
    }
}

NOTE: The function doesn't initially check whether it ought to evaluate recursively or not, but a simple if-else statement should catch that.

Building a Compiler in Rust – Part 1

Ethan Barry — Fri, 17 May 2024 03:13:30 GMT

Building a lexer & choosing a language.

Not too complicated, right?

At the end of the day, our computers really only understand ones and zeros, the state of logical gates in a piece of finely-engraved silicon, and the layout of charges in chips of memory. Programs, data, noise – they all look the same. Yet, we're able to write instructions in a text-based programming language, that are somehow turned into the bits the computer speaks natively. How does that happen? With a compiled language, like Pascal, C, or Rust, the compiler does the job of turning the language into processor instructions, which are then assembled into bits and stored in special executable formats. Today, I want to explore the deep and beautiful ways in which a programming language can be translated into machine instructions by building a compiler.

What are we compiling?

This is a good question to start with. There are many languages out there – the major, general-purpose programming languages everyone thinks of, and then less common ones, like gnuplot instructions, or sudo configuration, or other Domain-Specific Languages (DSLs). We could write a program to interpret or compile any Turing-complete language ever invented, but most of them are complicated. Instead, we'll choose a simple, classic language that was invented specifically for learning compiler techniques with, PL/0.

PL/0 is an instructional language created by Niklaus Wirth, the same guy who invented Pascal, and contributed so much in the earlier days of computer science. He provided a basic compiler for it in his book Algorithms + Data Structures = Programs, which is a great read if you can find it. He wrote his in Pascal, which is difficult to understand if (like me) you aren't used to Pascal and its lack of abstractions around string processing. I like Dr. Brian Callahan's guide much better – it's written in C, and makes much more sense. I'll be following his guide for most of this series, and I highly recommend checking it out as well, especially when I glaze over details.

PL/0 is a simple language. Unless we extend it, it has one data type, VAR, which is an integer. Here is a simple program, stolen from Wikipedia.

VAR x, squ;

{Square x and load to squ}
PROCEDURE square;
BEGIN
   squ := x * x
END;

BEGIN
   x := 1;
   {Square numbers from 1 to 10}
   WHILE x < 11 DO
   BEGIN
      CALL square;
      x := x + 1
   END
END.

There are not very many nice features here. I haven't even included I/O operations, but we'll deal with that later.

Our first job is to get this string of text into a string of important tokens. We can leave out whitespace and comments (the bits inside the {}). Now, how do we go about turning a string of text from a file into a string of tokens? We need a lexing program.

Building a Lexer

The `main.rs` file:

The first subtask is to make a program that can read a file. Since I want to make a nice, professional-looking program, I'll use the Clap CLI builder. This will let me run my program like so:

pl0rs -f path/to/source_code.pl0 -o path/to/target

Right now, we only need the -f part, for the file we're reading.

use std::path::PathBuf;
use std::process::exit;

use clap::{Parser, Subcommand};

use pl0rs::{self, lexer::lex, read_file, COMPILER_VERSION};

// Borrowed from Clap's boilerplate tutorial...
#[derive(Parser)]
#[command(version, about, long_about = None)]
struct Cli {
    /// Path to the input file
    #[arg(short, long, value_name = "FILE")]
    file: Option,

    /// Turn debugging information on
    #[arg(short, long, action = clap::ArgAction::Count)]
    debug: u8,
}

fn main() {
    let cli = Cli::parse();
    let mut file = String::new();
    let mut state = pl0rs::State::default();

    // Print compiler version, &c.
    println!("pl0rs -- PL/0 Compiler version {}", COMPILER_VERSION);
    println!("(c) Ethan Barry, 2024. GPLv3 licensed.");

    // Open the file and pass it into the file string.
    // Program can terminate here with an error code.
    if let Some(file_path) = &cli.file {
        if let Ok(file_string) = read_file(file_path) {
            file = file_string;
        } else {
            eprintln!("Error: No such file found.");
            exit(1);
        }
    }

    match cli.debug {
        0 => {
            println!("Debug mode is off.");
        }
        1 => {
            println!("Debug mode is on.");
            state.debug = true;
        }
        _ => {
            println!("Don't be crazy. Defaulting to debug mode on.");
            state.debug = true;
        }
    }

    // Call the other modules.
    match lex(&mut state, &file) {
        Ok(res) => {
            println!("Lexer succeeded.");
        }
        Err(e) => {
            {
                eprintln!("{e}");
                exit(1);
            } // A returned lexer error.
        }
    }
}

That's a minimum working example. I also want to talk about project organization for a second. We know that this will be a complex program. It will need to be tested. Unit tests, declared in mod test and annotated with #[test] are not enough to validate an entire compiler. We also need integration tests. An integration test acts like an external application calling your program. It should be able to test all the functionality, by essentially acting as another fn main() in the codebase.

That's why you should organize the project like this:

┌ src
├─ main.rs
├─ lib.rs
├─ lexer.rs
╰ tests
Cargo.toml
...

Your main.rs contains the glue logic that calls the real code in lib.rs. This way, your integration tests, in tests, can also call your library and test it against whatever test battery you create. It's a more advanced way of organizing the project.

The `lib.rs` file:

Now for the library where our compiler's functionality will live. We need to handle some housekeeping first. Let's declare a const VERSION so we can print it out when we want. All the real compilers display version numbers! Let's also declare our first module, the lexer.

//! File lib.rs
use std::{fs::File, io::Read, path::Path, process::exit};

pub mod lexer;

pub const COMPILER_VERSION: &str = "0.1.0";

pub struct State {
    pub debug: bool,
    pub line: u32,
}

impl Default for State {
    fn default() -> Self {
        Self {
            debug: false,
            line: 1,
        }
    }
}

I also created a struct called State. It will store whether we're in debug mode, and what line we're on. It's a good idea to impl Default whenever we create a struct we want to use. I think this trait can also be derived, but this way was more fun.

Next, here's a function to read in a file and return it as a string.

//! File lib.rs
/// Can cause program termination with an error code.
pub fn read_file(filename: &Path) -> Result {
    let path = Path::new(filename);

    // TODO: Eliminate unwrap.
    if !path.extension().unwrap_or_default().eq("pl0") {
        eprintln!("Error: File must have a .pl0 extension.");
        exit(1);
    }

    let mut file = match File::open(path) {
        Ok(file) => file,
        Err(e) => return Err(format!("Couldn't open file: {e}")),
    };

    let mut contents = String::new();
    match file.read_to_string(&mut contents) {
        Ok(_) => Ok(contents),
        Err(e) => Err(format!("Couldn't read file: {e}")),
    }
}

I'm enforcing that the file has a .pl0 extension, but that isn't really necessary, just fun. Once we've read in the file, we need to represent different tokens somehow. Dr. Callahan used #defines in his C program, but I think an enum will work nicely for Rust. The different types of tokens are these:

//! File lib.rs
#[derive(Debug, Clone)]
pub enum Token {
    Ident(String),
    Number(i64),
    Const,
    Var,
    Procedure,
    Call,
    Begin,
    End,
    If,
    Then,
    While,
    Do,
    Odd,
    Dot,
    Equal,
    Comma,
    Semicolon,
    Assign,
    Hash,
    LessThan,
    GreaterThan,
    Plus,
    Minus,
    Multiply,
    Divide,
    LParen,
    RParen,
}

impl PartialEq for Token {
    fn eq(&self, other: &Self) -> bool {
        std::mem::discriminant(self) == std::mem::discriminant(other)
    }
}

impl std::fmt::Display for Token {
    // Needs a better format method.
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{:?}", self)
    }
}

Each keyword, operator, and symbol has its own unique enum variant. Identifiers and numbers also get one. Here's where you choose your integer size; I chose i64, which means that I'll always try to parse a number in PL/0 source code into a 64-bit signed integer.

Brief funny story: I've done impl Display here by using a debug formatter. This is so I don't have to type {:?} every time I want to print a Token. Originally, I had used {}, which I assumed would just work. However, I kept getting stack overflow errors. I didn't realize it right away, but by calling write!(f, "{}", self) I was recursively calling fmt() and overflowing the stack. Just thought that was entertaining...

The `lexer.rs` file:

We have all the boilerplate. Now, we need to lex everything into the right sort of token. This will mean going through the file, character by character, and deciding what to do on the fly. An iterator would be nice!

Our lexer's primary function will take a string containing our PL/0 code, and a State that we can modify. It should return a Vec, but that might fail, so we'll return a Result, String> instead.

//! File lexer.rs

pub fn lex(state: &mut State, file: &str) -> Result, String> {
    let mut chars = file.chars().peekable();
    let mut comment = String::new();
    let mut tokens: Vec = vec![];

    // TODO!

    Ok(tokens)
}

First, let's process comments. In PL/0, comments are closed with braces, like this:

{this is a comment}

When we see an opening brace, we'll skip through all the characters until we find the closing brace that matches it. If there is none, it's an error. We'll add this after the TODO! comment.

'lexer: loop {
    if let Some(c) = chars.peek() {
        if (*c).eq(&'{') {
            chars.next(); // Consume the opening brace
            'comment: for c in chars.by_ref() {
                if c == '}' {
                    break 'comment;
                }
                if c.eq(&'\n') {
                    state.line += 1;
                }
                comment.push(c);
            }
        }
    } else {
        break 'lexer; // No more characters.
    }
}

There! We simply push all the characters in a comment to the comment string, in case we need them. Also be sure to increment the line count; we'll need that later.

The next trick is to discard whitespace between tokens. We can add some code to handle that in the same way we handle comments. After we've handled whitespace and comments, everything remaining is syntactically significant. A nested set of if-else statements will handle that. The entire lex() function is below.

//! File lexer.rs
pub fn lex(state: &mut State, file: &str) -> Result, String> {
    let mut chars = file.chars().peekable();
    let mut comment = String::new();
    let mut tokens: Vec = vec![];

    'lexer: loop {
        if let Some(c) = chars.peek() {
            if (*c).eq(&'{') {
                chars.next(); // Consume the opening brace
                'comment: for c in chars.by_ref() {
                    if c == '}' {
                        break 'comment;
                    }
                    comment.push(c);
                }
            } else if (*c).is_whitespace() {
                if c.eq(&'\n') {
                    state.line += 1;
                }
                chars.next(); // Consume the whitespace.
            } else if (*c).is_alphabetic() || (*c).eq(&'_') {
                let token = identifier(&mut chars, state)?;
                tokens.push(token);
            } else if (*c).is_numeric() {
                let token = number(&mut chars, state)?;
                tokens.push(token);
            } else if (*c).eq(&':') {
                let token = assignment(&mut chars, state)?;
                tokens.push(token);
            } else {
                let token = match *c {
                    '.' => Token::Dot,
                    '=' => Token::Equal,
                    ',' => Token::Comma,
                    ';' => Token::Semicolon,
                    '#' => Token::Hash,
                    '<' => Token::LessThan,
                    '>' => Token::GreaterThan,
                    '+' => Token::Plus,
                    '-' => Token::Minus,
                    '*' => Token::Multiply,
                    '/' => Token::Divide,
                    '(' => Token::LParen,
                    ')' => Token::RParen,
                    _ => {
                        return Err(format!("Unknown token on line {}", state.line));
                    }
                };
                tokens.push(token);
                chars.next();
            }
        } else {
            break 'lexer; // No more characters
        }
    }

    Ok(tokens)
}

A diagram might make the code easier to understand. I'll draw it as a state machine!

State machine diagram of the lexer. ( Made by the author in Inkscape. )

Any of the last four boxes on the right add a token to the Vec that we're storing the tokenized code in. Also note the assignment() block. It is called when the lexer finds a colon. In the PL/0 grammar, the assignment operator is :=, and the only place a colon can occur is in the assignment operator. So, if we find a colon without an equals sign, it's an error. Let's write that method now.

//! File lexer.rs
pub fn assignment(chars: &mut Peekable>, state: &mut State) -> Result {
    chars.next(); // Consume the ':' character.

    if let Some(c) = chars.peek() {
        if (*c).eq(&'=') {
            chars.next();
            Ok(Token::Assign)
        } else {
            Err(format!("Unknown token ':' on line {}", state.line))
        }
    } else {
        Err(format!("Unterminated assignment on line {}", state.line))
    }
}

This function is pretty straightforward. The if let syntax is a clean way to deal with the iterator. This method advances the iterator right away, because it is only ever called on a ':'. (If it were called somewhere besides a colon, this would be a logic error. You can make a note of that and move on.)

Words and Numbers

Now there are just two functions left to implement, number() and identifier(). These both need to take one character at a time, until they encounter whitespace, or something that isn't legally a number or identifier. Dr. Callahan suggested extending PL/0 to allow breaking up large integers with underscores, like we can do in modern languages. We'll add that as well.

Also notice that the function lex() calls identifier() whenever it sees an alphabetic character or underscore. This includes keywords and variable names. It will have to distinguish them, and return the correct Token enum variant.

//! File lexer.rs
pub fn identifier(chars: &mut Peekable>, state: &mut State) -> Result {
    let mut idt = String::new();

    loop {
        // Push characters onto idt one by one.
        if let Some(c) = chars.peek() {
            if (*c).is_alphanumeric() || (*c).eq(&'_') {
                idt.push(*c);
                chars.next();
            } else {
                break;
            }
        } else {
            // Was a None.
            return Err(format!("Unterminated identifier on line {}", state.line));
        }
    }

    let token = match idt.to_lowercase().as_str() {
        "const" => Token::Const,
        "var" => Token::Var,
        "procedure" => Token::Procedure,
        "call" => Token::Call,
        "begin" => Token::Begin,
        "end" => Token::End,
        "if" => Token::If,
        "then" => Token::Then,
        "while" => Token::While,
        "do" => Token::Do,
        "odd" => Token::Odd,
        _ => Token::Ident(idt),
    };

    Ok(token)
}

pub fn number(chars: &mut Peekable>, state: &mut State) -> Result {
    let mut num = String::new();
    loop {
        if let Some(c) = chars.peek() {
            if (*c).is_numeric() || (*c).eq(&'_') {
                num.push(*c);
                chars.next();
            } else {
                break;
            }
        } else {
            // Was a None.
            return Err(format!("Unterminated number on line {}", state.line));
        }
    }

    if let Ok(res) = num.parse::() {
        Ok(Token::Number(res))
    } else {
        Err(format!("Invalid number at line {}", state.line))
    }
}

Notice that identifier() allows variables with underscores or numbers after an initial letter or underscore, so x_0 and _1000 are legal variable names. number() parses the complete number string into an i64, as we decided earlier.

At the risk of making the article too long for anyone to read, I put the complete lexer.rs file below, as written up to this point.

//! File lexer.rs (complete)
use std::{iter::Peekable, str::Chars};

use crate::{State, Token};

/// ## Preconditions
///
/// - Must only be called for characters which can begin an identifier.
pub fn identifier(chars: &mut Peekable>, state: &mut State) -> Result {
    let mut idt = String::new();

    loop {
        // Push characters onto idt one by one.
        if let Some(c) = chars.peek() {
            if (*c).is_alphanumeric() || (*c).eq(&'_') {
                idt.push(*c);
                chars.next();
            } else {
                break;
            }
        } else {
            // Was a None.
            return Err(format!("Unterminated identifier on line {}", state.line));
        }
    }

    let token = match idt.to_lowercase().as_str() {
        "const" => Token::Const,
        "var" => Token::Var,
        "procedure" => Token::Procedure,
        "call" => Token::Call,
        "begin" => Token::Begin,
        "end" => Token::End,
        "if" => Token::If,
        "then" => Token::Then,
        "while" => Token::While,
        "do" => Token::Do,
        "odd" => Token::Odd,
        _ => Token::Ident(idt),
    };

    Ok(token)
}

pub fn number(chars: &mut Peekable>, state: &mut State) -> Result {
    let mut num = String::new();
    loop {
        if let Some(c) = chars.peek() {
            if (*c).is_numeric() || (*c).eq(&'_') {
                num.push(*c);
                chars.next();
            } else {
                break;
            }
        } else {
            // Was a None.
            return Err(format!("Unterminated number on line {}", state.line));
        }
    }

    if let Ok(res) = num.parse::() {
        Ok(Token::Number(res))
    } else {
        Err(format!("Invalid number at line {}", state.line))
    }
}

pub fn assignment(chars: &mut Peekable>, state: &mut State) -> Result {
    chars.next(); // Consume the ':' character.

    if let Some(c) = chars.peek() {
        if (*c).eq(&'=') {
            chars.next();
            Ok(Token::Assign)
        } else {
            Err(format!("Unknown token ':' on line {}", state.line))
        }
    } else {
        Err(format!("Unterminated assignment on line {}", state.line))
    }
}

pub fn lex(state: &mut State, file: &str) -> Result, String> {
    let mut chars = file.chars().peekable();
    let mut comment = String::new();
    let mut tokens: Vec = vec![];

    'lexer: loop {
        if let Some(c) = chars.peek() {
            if (*c).eq(&'{') {
                chars.next(); // Consume the opening brace
                'comment: for c in chars.by_ref() {
                    if c == '}' {
                        break 'comment;
                    }
                    if c.eq(&'\n') {
                        state.line += 1;
                    }
                    comment.push(c);
                }
            } else if (*c).is_whitespace() {
                if c.eq(&'\n') {
                    state.line += 1;
                }
                chars.next(); // Consume the whitespace.
            } else if (*c).is_alphabetic() || (*c).eq(&'_') {
                let token = identifier(&mut chars, state)?;
                tokens.push(token);
            } else if (*c).is_numeric() {
                let token = number(&mut chars, state)?;
                tokens.push(token);
            } else if (*c).eq(&':') {
                let token = assignment(&mut chars, state)?;
                tokens.push(token);
            } else {
                let token = match *c {
                    '.' => Token::Dot,
                    '=' => Token::Equal,
                    ',' => Token::Comma,
                    ';' => Token::Semicolon,
                    '#' => Token::Hash,
                    '<' => Token::LessThan,
                    '>' => Token::GreaterThan,
                    '+' => Token::Plus,
                    '-' => Token::Minus,
                    '*' => Token::Multiply,
                    '/' => Token::Divide,
                    '(' => Token::LParen,
                    ')' => Token::RParen,
                    _ => {
                        return Err(format!("Unknown token on line {}", state.line));
                    }
                };
                tokens.push(token);
                chars.next();
            }
        } else {
            break 'lexer; // No more characters
        }
    }

    Ok(tokens)
}

We have a fully functional lexer for our compiler, that tokenizes and understands PL/0! Congratulations! Our next steps come in the next post in this series, where we'll parse the Vec that our lexer spits out.

As always, thank you for reading! If you have suggestions or comments, please write me at ethan.barry@howdytx.technology. Have a great day!

Lagrange Interpolation

Ethan Barry — Sat, 16 Mar 2024 16:50:00 GMT

Or, how to avoid doing math...

...by doing different math!

Interpolation is a very common problem. You need to interpolate any time you have:

A table of values, in larger increments than you'd like,
A set of points on some unknown function,
Known values some time apart, and you'd like to know what's in between.

Actually, these are all the same problem. Let's explore some solutions!

Bad Interpolation

There are a lot of techniques for interpolation. 90% of them are guaranteed to be bad, but I think it's a good idea to show why.

In real life, when I'm trying to interpolate something (maybe on the occasions I mess around with a cookbook) I try to "eyeball" it. For the amount of salt in a fractionally-scaled recipe, that's okay. (There is no such thing as too much salt. Fight me.) But for practical applications, we need to do better.

Back in The Old Days™, people had to consult log tables and/or slide rules for values of functions. This still happens today in astronomy, where orbital values come in tables called ephemerides, or almanacs, like these published by the United States Naval Observatory. If you want something between the given values, you need to interpolate.

Linear Interpolation

The simplest (formal) method is to fit the two points nearest your x-value to a line. If we know \(sin(x)\) at two points, say \(x = 0\) & \(x = 2\), then we might interpolate it like this:

Plot of sin(x) and a linear interpolation from 0 to 2.

...which is objectively terrible. It could be better if we had a smaller increment than 0 to 2, but linear interpolation is inaccurate unless the thing it's interpolating is fairly linear itself. The upside is that it's easy to compute.

Smooth(er) Interpolation With Cosine

We can get a smoother curve by using weights on the interpolation function. We need something that produces a curve anywhere we like, so let's choose cosine. Now, given two points, we'll say: \[y_i \approx y_1 (1-\frac{1}{2}(1-cos(\pi \frac{x - x_1}{x_1-x_2})) + y_2 (\frac{1}{2}(1-cos(\pi \frac{x - x_1}{x_1-x_2}))\]

That's a mess. The \(\frac{x - x_1}{x_1-x_2}\) terms are our weight. When \(x\) is equal to \(x_1\), the term is zero, and all the weight is on \(y_2\). Conversely when \(x\) equals \(x_2\), the weight is one, and it's all on \(y_1\). Here's our visual example with the cosine interpolation added.

Plot of sin(x), our linear interpolation, and a cosine-smoothed interpolation, all from 0 to 2.

Notice that all this has done is smooth our linear function. It's still not precise. It would make a nice curve between multiple points, but let's try for something better.

Lagrange Polynomials and Lagrangian Interpolation

Let's be more ambitious. Say we have a set of \(n\) points (you decide how many) and we want to fit a polynomial to them. There will be many polynomials of degree \(> n\) that go through them, but only one of degree \(< n\). Usually, this polynomial's degree is \(n-1\). If the points happen to lie on a lower-degree polynomial, then the degree \(n-1\) polynomial's leading coefficients are zero.

Given our set of points \[(x_1, y_1), (x_2, y_2), ..., (x_i, y_i), ..., (x_n, y_n)\] the polynomial we want is this: \[y = \sum_{i=1}^n y_iL_i(x)\] where \(L_i(x)\) is the \(i\)th Lagrange polynomial, given by: \[L_i(x) = \prod_{j=1,j\ne i}^n\frac{x-x_j}{x_i-x_j}\]

In other words, the first Lagrange polynomial is: \[L_1(x) = \frac{(x-x_2)(x-x_3)...(x-x_n)}{(x_1-x_2)(x_1-x_3)...(x_1-x_n)}\]

We have a polynomial that goes through all the points, and it is also the polynomial of least degree that does so. (That's important for maximum accuracy.) The linked textbook walks through this topic in more detail, but I'll jump straight to my code.

To get a computer to do this for us, we'll first need to pull in our table of values, which is a vector of points in Computer-Science-ese, and evaluate the \(i\)th Lagrange polynomial at the \(x\)-value we want to interpolate, for all \(i\) in \([0,n]\). This just amounts to a couple of for-loops.

I used a janky iterator to exclude \(i\) in the internal loop. (Recall the provision \(j\ne i\) in our Lagrange polynomial definition.) I'm sure there's a better way to do this, but that's all I could come up with at the time...

fn lagrangian_interpolation(table: Table, xval: f64) -> f64 {
    let n = table.len();
    let mut sum = 0.0;

    for i in 0..n {
        if let Some(Point(x_i, y_i)) = table.pt_at(i) {
            // Evaluate the ith Lagrange poly'n. at xval and add it to sum.
            let mut product = 1.0;

            // Iterator grabs all but i.
            for j in (0..n)
                .enumerate()
                .filter(|&(pos, _)| (pos != i))
                .map(|(_, e)| e)
            {
                let x_j = table.pt_at(j).unwrap().0;
                product *= (xval - x_j) / (x_i - x_j);
            } // product is now the value of L_i(xval).

            sum += y_i * product;
        } else {
            break;
        }
    }

    sum
}

I set up my main function to print out a tab-separated list of points, and redirected this to a file. Then, I used gnuplot to plot these points in yellow, so we have a comparison of all our methods.

Same graph, but with our Lagrange interpolated values shown as yellow dots.

Notice that the yellow tracks the actual function in blue very well, until about 1.5 radians. I ran this test with a table of four values, and began interpolating points with roughly \(x = \frac{\pi}{6}\), in increments of \(0.01\), for 1000 iterations. It's extrapolating past my last tabulated value of \((1.5707963, 1.0)\), or \((\frac{\pi}{2}, 1)\), and doing an acceptable job for a little while.

The reader is welcome to try fixing the table so the interpolation is more accurate over the interval. All the Rust and gnuplot source code are in my repo here.

This technique is fairly fast, efficient, and is useful in Gaussian quadrature, as well as other modern applications. I hope you find it interesting, if not useful. Either way, let me know what you thought on GitHub, LinkedIn, or by email to ethan.barry@howdytx.technology!

References

Paul Bourke's classic 1999 post. His blog is pure gold; do yourself a favor and bookmark it.

Interpolation methods

Numerical Methods Guy, 2008:

High order interpolation is a bad idea? – Numerical Methods Guy

Numerical Methods Guy Numerical Methods for the STEM Undergraduateautarkaw

This textbook on Celestial Mechanics, where I first learned about the Lagrange interpolation algorithm:

1.11: Fitting a Polynomial to a Set of Points - Lagrange Polynomials and Lagrange Interpolation

LibretextsLibretexts

Simpson's Rule & More

Ethan Barry — Sat, 24 Feb 2024 21:18:00 GMT

Integral Algorithms for the 21st Century

Aren't they mostly derivative work?

Many problems in calculus textbooks are contrived. In fact, most of them are. In reality, the sorts of integrals people like engineers and applied mathematicians see are not (practically) solvable analytically. You can't use the tricks from Calculus II to antidifferentiate the integrand, because it's simply too complicated. So what do we do?

The Quest for Quadrature

When integrals can't be antidifferentiated, or when it's too time-consuming to do so, the area under the curve can be approximated by a computer. These numerical, rather than analytical, methods are called "quadrature." (There is an algorithm to either symbolically integrate a function, or else show it cannot be expressed in elementary functions, but this is far beyond this post's scope!)

Perhaps the simplest quadrature algorithm is to draw a line between the function's value at the endpoints and calculate the area under it, using simple geometry. This would be both A) "interpolation to a linear function," and B) very inaccurate. But, as always, we can make smaller slices.

Trapezoidal Quadrature

Suppose we take the interval \([a,b]\) and slice it into many smaller pieces, drawing a line between the function values at each. This would give us a series of trapezoids under the curve. It seems like this would nicely approximate the curve, but it turns out that the trapezoidal method is nearly always worse than the technique we originally learned early in calculus for estimating area with sums of rectangles!

Simpson's Rule

So, if we have a gently curved integrand which is too complicated to find an antiderivative for, like this, \(\int_a^b f(x),dx\), and we want to integrate it, then let's try interpolating it to something we can calculate. We'll evaluate the function three times, once at \(a\), then at \(b\), and finally at the midpoint.

This curve can be seen as three points on a piece-wise quadratic. Instead of interpolating to a linear function, we've interpolated to a quadratic, using one extra function evaluation. And the area under this quadratic is approximately: \[\frac{1}{3}(f(a)+4f(c)+f(b))\] where \(c\) is the midpoint of \(a\) and \(b\). Then, to gain accuracy, we can split the interval into \(n\) small sections, and interpolate for each one of those, so that:

\[\int_a^b f(x),dx \approx \sum_{1\leq i\leq n} \frac{x_{i+1} - x_i }{6} (f(x_i) + 4f(\frac{x_i + x_{i+1}}{2}) + f(x_{i+1})) \]

All we need is a function that takes a number of intervals to evaluate (the precision) and the bounds. Since I want to pass in any mathematical operation, I'll also take a closure, which is a way of passing a function as an argument to another function. (Yes, functions are data types too!) Here's my Rust implementation:

fn int_simpson(a: f64, b: f64, n: i32, integrand: F) -> f64
where
    F: Fn(f64) -> f64,
{
    // We want to sum from a to b, in n subintervals, where the width delta_x is (a - b) / n.
    // Each subinterval is [a, a+delta_x] up to b.
    let delta_x = (b - a) / n as f64;
    let mut result = 0.0_f64;
    for i in 1..=n {
        result += delta_x
            * (integrand(a + (i as f64 - 1_f64) * delta_x)
                + 4_f64 * integrand(a - delta_x / 2_f64 + i as f64 * delta_x)
                + integrand(a + i as f64 * delta_x))
            / 6_f64;
    }
    result
}

The complete program can be found here, along with an example of using it, and even asserting that the floating-point result is accurate to a certain precision!

Other Methods

I think it's interesting that we can see the rectangle rule, trapezoid rule, and Simpson's rule all have something in common. They're interpolating to polynomials (zeroeth, first, and second degrees, respectively) and estimating. There are certainly better methods available, especially adaptive quadratures, for functions with large high-order derivatives, but this is a quick-and-dirty result that achieves enough accuracy for most use cases. I hope you enjoyed!

Footnote: I've added LaTeX formatting to this post using the excellent KaTeX JS package. Simply add it to your site header, and it will typeset any LaTeX markup it sees. Instructions are here.

It Ain't Your Grandpappy's Java...

Ethan Barry — Sat, 10 Feb 2024 10:27:00 GMT

Recent Java releases bring a lot of QoL improvements.

Here are a few of the biggest...

Java is the de facto instructional language of many universities. It's widespread, it runs on most platforms, it makes a good case study in OOP concepts, and there are a lot of teachers available.

But in many cases, the Java that's taught in the classroom isn't much like production Java. We learn the oldest syntax, and the oldest design patterns, that accomplish the task. It makes typing Java feel like drudgery compared to a more modern language like Rust or Go. Seriously, who wants to type this mess:

MySpecialClassName myClass = new MySpecialClassName();

to instantiate a variable?

The `var` Keyword

In more modern languages, we have type inference. The compiler can tell what a type a variable holds based on what's assigned to it. But wait, Java actually has this too! Instead we can write this:

var myClass = new MySpecialClassName();

which is much more streamlined. var is a special "reserved type name" that was added in Java 10, to make declarations easier to write. So why don't we learn this in university? Personally, the Java courses I've taken have been Java 7 compatible, or older. Java 7 was released in 2011. A lot has changed since then.

The `Optional` Type

In Rust, there's a special enum included in the compiler's prelude, called Option. It has two variants, Some() and None. The convention is to use it when a function may or may not return something, such as a query to a database.

Java added a similar concept with the Optional in Java 8 (released in 2014). It's a box that might hold a value, or a null pointer. The biggest advantage is that by making this return type explicit, developers have to handle the possibility of a null pointer when it might be returned from a method. Null-checking is simplified. This:

Apple apple = mightBeNull("granny smith");

if (apple != null) {
    System.out.println(apple.toString());
} else {
    System.out.println("apple was null");
}

becomes this (if we use our fancy new type inference):

Optional optApple = mightBeNull("granny smith");

var appleVariety = optApple.orElse("No apple in this box...");

System.out.println(appleVariety); // This is safe now!

Record Classes

A common pattern in Java is using an object to hold related data, usually as it's being moved from a source to a consumer. If you had a database of apples, you might represent each apple as a class, like this:

public class Apple {
    private String variety;
    private float pricePerPound;
    
    public String getVariety() {
        return this.variety;
    }
    
    public float getPrice() {
        return this.pricePerPound;
    }
    
    public Apple(String v, float p) {
        this.variety = v;
        this.pricePerPound = p;
    }
    
    @Override
    public String toString() {
        return "Apple(variety=" + this.getVariety() + ", price=" + this.getPrice();
    }
}

That's a lot of code for one apple! Java 14 added a solution in the form of the record class. With modern Java, our Apple is just:

public record AppleRecord(String variety, float pricePerPound) {}

That's literally it.

It implements getters automatically; just call AppleRecord.variety() to get the variety, for example. By default, the fields are immutable, so a record is instantiated like this:

var apple = new AppleRecord("pink lady", 0.20);

If you need custom logic, you can override these default methods in the record's block. Easy!

Conclusion

So why aren't we being taught these concepts? Is it the result of a one-size-fits-all approach to CS education? Are the instructors even familiar with these later developments in the Java ecosystem? I don't know.

What I do know is that being able to think in terms of Optionals, and reduce boilerplate code, will make Java feel like a whole new language for me!

Dyck Words & Catalan Numbers

Ethan Barry — Wed, 07 Feb 2024 21:02:00 GMT

Computer Science + Math

= Awesome

I recently watched a Numberphile video on the Catalan series, which happens to count lot of different things, from partitions of sets, to divisions of polygons, to unique binary trees of a certain order. The video is fascinating and well worth a watch.

One other thing the Catalan numbers count is the number of possible Dyck words of a given semi-length. A Dyck word is a word in a binary (having two elements) alphabet, where no prefix of the word has more of the second element than the first. That's the complicated definition. The simple definition is a balanced set of parentheses, like so: "()()(())" or "((()))()". This should jump out as interesting to any developer - we use parentheses all day!

So, the Catalan numbers count the Dyck words of a given semi-length. For example, there are 42 Dyck words of semi-length 5, since the fifth Catalan number is 42. But how can we generate them all? And more importantly, how can we do it..... in Rust?

It turns out that defining the Catalan numbers and Dyck words in terms of recursion is easy. But with recursion, there's a limit to how far our program can go. Eventually, the call stack won't be able to hold all our recursive function calls, and our program will terminate with a stack overflow. Even if we use the heap with more advanced features, we will slowly eat into memory and run up against the computer's physical limits. (In my case, this is 24 GiB of DDR3 in a mobile workstation from 2013.)

We need a better algorithm than the recursive solution. We need a perfect algorithm.

...and there is one! It is entirely possible to convert recursive definitions and functions into iterative ones, and someone (Dr. Knuth, it seems) has already done so for this particular problem. After a bit of digging, I was able to find a paper on the subject.

The algorithm goes as close as possible to the metal by representing opening parentheses as a 1 bit, and closing parentheses as a 0 bit. In the abstract you can see the algorithm in C-like pseudo-code. It's simply bitwise integer math. Fast, efficient, loop-less, and branch-less. The only trick is that it requires the "previous" Dyck word to act upon.

You can generate the "next" Dyck word in the sequence by flipping pairs around. Taking the Dyck word of the form "()()()()()()" or "101010101010" to be the least, and the word of form "(((((())))))" or "111111000000" to be the greatest, we can run this algorithm on its output until we reach the greatest Dyck word of the given semi-length. The integer passed in to the algorithm must represent a valid Dyck word, or it's a precondition violation.

The full source of my program is here, but I'll reproduce the Rust port of the algorithm below.

fn next_dyck_word_bin(w: u64) -> u128 {
    // Calculate a, potentially using two's complement for negation.
    let a = w & (!w + 1);

    // Calculate b directly.
    let b = w + a;

    // Calculate c using bitwise XOR.
    let c = w ^ b;

    // Shift c right by 2, ensuring unsigned division.
    let c = (c as u128 / a as u128) >> 2; // Cast to u128 for division.

    // Add 1 to c, handling potential overflow.
    let c = c.wrapping_add(1);

    // Apply mask and bitwise OR with b.
    let mask = 0xaaaa_aaaa_aaaa_aaaa_aaaa_aaaa_aaaa_aaaa; // I think this is long enough...

    // Return the resulting Dyck word as a u128.
    ((c * c - 1) & mask) | b as u128
}

I've run the program for semi-length 20, whose Catalan number is 6,564,120,420. As you can see, the series grows quickly. It ran for 6.5 hours before I terminated it; the output file had reached ~200 GiB and over 4 billion of the 6.5 billion Dyck words, represented as ones and zeroes. But in that time, my memory usage never wavered, and my CPU stayed locked at 100%. It was very interesting to watch in reality what O(1) means - it's an invincible algorithm.

A couple improvements to the rest of the program have been suggested to me; I'll implement them soon. Next, I want to work on some numerical integration techniques, like Simpson's method. This was an exciting peek into the world of combinatorics and theoretical computer science. I'm excited to see more!

HowdyTX

UNIX History & the `dc` Calculator

"...an elegant weapon for a more civilized age."

The History

Now

Creating a LaTeX Template for UT Tyler

Because everyone needs more LaTeX in their life.

The Problem

The Solution

Visible Changes

Changes "Under the Hood"

Conclusion

The Best Possible Quadrature Routine (Within Reason)

Gauss-Lobatto-Kronrod Quadratures

Computing Definite Integrals

Gaussian Quadrature

Adaptive Quadrature

Conclusion

Building a Compiler in Rust – Part 1

Building a lexer & choosing a language.

What are we compiling?

Building a Lexer

The main.rs file:

The lib.rs file:

The lexer.rs file:

Words and Numbers

Lagrange Interpolation

Or, how to avoid doing math...

Bad Interpolation

Linear Interpolation

Smooth(er) Interpolation With Cosine

Lagrange Polynomials and Lagrangian Interpolation

References

Simpson's Rule & More

Integral Algorithms for the 21st Century

The Quest for Quadrature

Trapezoidal Quadrature

Simpson's Rule

Other Methods

It Ain't Your Grandpappy's Java...

Recent Java releases bring a lot of QoL improvements.

The var Keyword

The Optional Type

Record Classes

Conclusion

Dyck Words & Catalan Numbers

Computer Science + Math

The `main.rs` file:

The `lib.rs` file:

The `lexer.rs` file:

The `var` Keyword

The `Optional` Type