data Jumble = I Integer | X Integer | Q (Ratio Integer) | D Double | Z (Double, Double) deriving Eq
Numbers in J
The next major obstacle standing in the way of our J interpreter, is understanding J’s eccentric type system.
J has six kinds of numbers with rules for type promotion. We denote the types with the letters B I X Q D Z, and they represent booleans, 64-bit integers, arbitrary-precision integers (J calls these extended integers), arbitrary-precision rationals, doubles, and complex doubles. We have listed the types in order of promotion.
My impression is that B only exists so that J can perform certain
optimizations, so we’ll combine the B and I types. We define the Jumble
type
to hold a J number. ("Jumble" starts with J, sounds similar to "number",
and is indeed a hodge-podge mix of different things.)
Numbers are displayed in a format specific to J. Extended integers and ints look the same. If possible, J suppresses the denominator of rationals, the fractional part of doubles, and the imaginary part of complex numbers, so for each type there are numbers that look like ints.
In Haskell, more sensibly,
read
is
the inverse of show
.
jShowInt n | n < 0 = '_':jShowInt (-n) | otherwise = show n jShowDouble n | n < 0 = '_':jShowDouble (-n) | '.' `elem` s = reverse $ dropWhile (`elem` "0.") $ reverse s | otherwise = s where s = printf "%.6g" n instance Show Jumble where show (I x) = jShowInt x show (X x) = jShowInt x show (Q x) = jShowInt (numerator x) ++ (if denominator x == 1 then "" else 'r':show (denominator x)) show (D x) = jShowDouble x show (Z (x, y)) = jShowDouble x ++ (if y == 0 then "" else 'j':jShowDouble y)
The following are the rules for type promotion:
pro (I x) (I y) = (I x, I y) pro (X x) (X y) = (X x, X y) pro (Q x) (Q y) = (Q x, Q y) pro (D x) (D y) = (D x, D y) pro (Z x) (Z y) = (Z x, Z y) pro (I x) (X y) = (X x, X y) pro (I x) (Q y) = (Q $ x % 1, Q y) pro (I x) (D y) = (D $ fromIntegral x, D y) pro (I x) (Z y) = (Z (fromIntegral x, 0), Z y) pro (X x) (Q y) = (Q $ x % 1, Q y) pro (X x) (D y) = (D $ fromIntegral x, D y) pro (X x) (Z y) = (Z (fromIntegral x, 0), Z y) pro (Q x) (D y) = (D $ fromRational x, D y) pro (Q x) (Z y) = (Z (fromRational x, 0), Z y) pro (D x) (Z y) = (Z (x, 0), Z y) pro x y = pro y x
When numbers of different types meet, the lesser type is promoted to the greater type.
All elements of a J array must have the same type; if one is promoted, then all must be promoted.
Let’s examine multiplication:
maxint = 2^63 - 1 minint = -2^63 checkOverflow z | z >= minint && z <= maxint = I z | otherwise = D $ fromIntegral z jMul (I x) (I y) = checkOverflow $ x * y jMul (X x) (X y) = X (x * y) jMul (Q x) (Q y) = Q (x * y) jMul (D x) (D y) = D (x * y) jMul (Z (a, b)) (Z (c, d)) = Z (a*c - b*d, a*d + b*c) jMul x y = uncurry jMul $ pro x y
With the I type, when the result overflows a 64-bit int, we promote to a double, possibly losing precision. With the X type, we only promote to double if the other argument is a double.
For a number of type I, the motivation appears to be that the internal representation should never need more than 128 bits, come what may. In fact, the exponentiation operator in J pre-emptively converts int types to doubles so even intermediate results are guaranteed to be small, though it does mean numbers one might expect to have type I will actually have type D.
For numbers of type X, the motivation appears to be to preserve precision as long as possible.
On the one hand, this handles overflow gracefully. All too often
we discover a
fixed-with int is too small long after writing the code.
On the other hand, explicit type conversions prevent bugs.
Functions like fromIntegral
and fromRational
are cumbersome but increase
reliability.
Trying to be clever automatically can backfire: when we need to know what is happening, we must refer to documentation to understand the rules, then hope we can trace them correctly by hand.
Above, we’ve omitted positive and negative infinity, which J denotes by one and two underscores respectively. J seems to have multiple flavours of each infinity because J arrays must have the same type. We’ll see how far we can get without implementing infinities.
Taken Literally
Strange rules govern J number literals:
2x 3 4 2 3 4 2 3 4e0 2 3 4 2x 3 4e0 |ill-formed number 9223372036854775807 9223372036854775807 9223372036854776832 9223372036854775807 9223372036854776833 9.22337e18 --9223372036854775808 9223372036854775807 -_9223372036854775808 9.22337e18
Wading through the documentation, we learn radix notation for J numbers, and a few miscellaneous statements about numbers, but there is insufficient detail to explain the above.
We ignore these oddities. In fact, we’ll start simple and only support 64-bit integer constants.
We don’t serve your type here
There are non-numerical J types: strings, boxes, and gerunds. Boxes can be thought of as pointers. For our purposes, gerunds are boxed strings. In fact, they often behave the same:
((+`(<'-')) @. 1) 1 2 3 _1 _2 _3 ((+`-)) @. 1) 1 2 3 _1 _2 _3
However, somewhat inexplicably:
(+`-) = (+`(<'-')) 1 0
We’ll just equate gerunds with boxed strings, sacrificing more J compatibility.
Concrete nouns
The above suggests representing J nouns the type Shaped Jumble
, where
Shaped
is our shaped array type we developed before, and
Jumble
has been extended to accommodate boxes and strings:
data Jumble = I Integer | X Integer | Q (Ratio Integer) | D Double | Z (Double, Double) | S String | Box (Shaped Jumble) deriving Eq instance Show Jumble where show (S x) = x show (Box x) = "[" ++ show x ++ "]" -- ...
We’re almost ready. We just need to understand reflection in J.