As per #29, it seems that we limit ourselves to lines, quadratic Bezier and cubic Bezier curves. This would change the coordinates distribution and thus gives an opportunity to simplify and optimize the coordinates encoding.
Concrete Analysis
To get a gist of the typical usage, I've analyzed the Material Design Icons as of Templarian/[email protected] (which also includes the Google's icon set). A quick and dirty Python script used is available here.
0 1 2 3 4 5 6

M  22073 2 0 0 0 0 0  22075
L  16774 3547 10394 3325 1668 154 18  35880
C  17939 4987 5525 873 221 46 1  29592
Q  9 66 13 28 19 0 0  135
Z  5614 0 0 0 0 0 0  5614
+
line  18816 3221 8243 1572 701 55 9  32617
+
62409 8602 15932 4226 1908 200 19  93296
This table shows the number of operations and (run length  1) in bits. So 0 corresponds to a run of 1, 1 to a run of 2, 2 to runs of 34, 3 to runs of 58 and so on. All shapes have been normalized to one of M, L, C, Q and Z; arcs in particular have been converted to cubic Bezier curves with the same algorithm as the current C version of IconVG.
It seems that there are significant gaps between 23 bits and 45 bits. This is of course a characteristic of this particular icon set, but it seems that we can safely limit the max run length to 16 with a minimal size increase (<0.3% in this case, theoretically 6.25% max).
The "line" row is a subset of L where only perpendicular lines are allowed. This was to see if having separate V
or H
opcodes with a run is desirable; it seems not, given that a vast majority is a run of just one. It however shows that most lines are perpendicular, suggesting a percoordinate flag.
rel=0 rel=2 rel=3 rel=4 rel=5 rel=6 no rel

abs=0  477 2 51 32 14 10 167  753
abs=1  2068 164 283 174 90 68 737  3584
abs=2  17969 516 3415 1192 1035 933 6386  31446
abs=3  26993 1290 6440 5666 4693 1025 13364  59471
abs=4  53465 3088 15459 12100 7933 4 29965  122014
abs=5  55161 2272 10986 5645 7161 4601 23959  109785
abs=6  0 0 0 0 0 1 0  1
no abs  91903 2418 2467 1134 740 39 174819  273520
+
248036 9750 39101 25943 21666 6681 249397  600574
This table shows the frequencies of coordinates that can be safely represented in onebyte absolute or relative encoding. I've made tons of assumptions:

No further quantization has been performed. If the number is not close enough (±0.01) to the nearest integer it was assumed to not fit in one byte at all.

The icon set has a fixed view box of (0,0)(24,24). This means that absolute integral coordinates almost always fit in 5 bits. To be fair I think this small view box does model the postquantization distribution. I've also tested with every coordinate multiplied by 210, 16 and 100; the entire table is too large to include here though.

There is no relative encoding with a single bit, since it will entirely consist of a sign bit. For more bits 2's complement representation was assumed.

The relative encoding always refers to the last coordinates. This means that, say, if there are 2 coordinates per one operation then the second coordinates are encoded relative to the first coordinates. This is different from SVG but seems reasonable especially given that we no longer has arcs (which center points significantly deviate from the path). Just to be sure I've also tested with SVGlike reference coordinates and found them to be much worse.

The initial coordinates used for the relative encoding are (0,0). There might be a better default or even a general way to maximally use available onebyte space, but such improvements cannot harm this baseline encoding.

We don't care about the exact encoding (would it be 2 bytes quantized or 4 bytes IEEE 754?) if the number doesn't fit to one byte in any encoding. I do think it should be quantized to 2 bytes in that case.
There are two kinds of encodings imaginable with this table.
 Use a separate flag, allocate A bits to absolute and B bits to relative. This would use 1+max(A,B) bits; the asymmetric A != B cases can be used to stuff other encodings.
 Allocate A bits to absolute, but switches to relative if the encoded number is less than 2^B. This would use A bits; both the absoluteonly encoding (B = 0) and the relativeonly encoding (B = A) are included.
In addition to this, it is possible that coordinates are multiplied by some scaling factor, possibly differently for absolute and relative encodings. The shared scaling factor models the optimization performed by authoring tools, so it doesn't need to be a part of the encoding itself. The ratio between absolute and relative encodings would have to be fixed or somehow encoded however.
I've simulated both encodings with all possible parameters and collected the best and runnerups within 10% of the best (plus a selected few indicated with *
). The first number is the number of required multibyte encodings, so lower is better.
4 BITS:
236518 4 bits 1x absolute
246139 4 bits 1x absolute, 2 bits reinterpreted as 1x relative
253685 4 bits 1x absolute, 3 bits reinterpreted as 1x relative
268101* 1 bit flag, 3 bits 1x absolute or 3 bits 1x relative
5 BITS:
181894 5 bits 1x absolute
193787 5 bits 1x absolute, 2 bits reinterpreted as 1x relative
211320* 1 bit flag, 4 bits 1x absolute or 4 bits 1x relative
6 BITS:
161368 6 bits 2x absolute, 2 bits reinterpreted as 1x relative
161855 6 bits 2x absolute
169863 6 bits 2x absolute, 3 bits reinterpreted as 1x relative
174859 1 bit flag, 5 bits 1x absolute or 5 bits 1x relative
175599 1 bit flag, 5 bits 1x absolute or 4 bits 1x relative
176733 1 bit flag, 5 bits 1x absolute or 3 bits 1x relative
181893* 6 bits 1x absolute
7 BITS:
156161 7 bits 4x absolute, 3 bits reinterpreted as 1x relative
156467 7 bits 4x absolute, 2 bits reinterpreted as 1x relative
157162 1 bit flag, 6 bits 2x absolute or 6 bits 1x relative
157174 1 bit flag, 6 bits 2x absolute or 5 bits 1x relative
157506 1 bit flag, 6 bits 2x absolute or 4 bits 1x relative
158129 1 bit flag, 6 bits 2x absolute or 3 bits 1x relative
158312 7 bits 4x absolute
159624 1 bit flag, 6 bits 2x absolute or 2 bits 1x relative
161367 7 bits 2x absolute, 2 bits reinterpreted as 1x relative
161579 1 bit flag, 6 bits 2x absolute or 0 bits 1x relative
161854 7 bits 2x absolute
164619 7 bits 4x absolute, 4 bits reinterpreted as 1x relative
169862 7 bits 2x absolute, 3 bits reinterpreted as 1x relative
174819* 1 bit flag, 5 bits 1x absolute or 6 bits 1x relative
181893* 7 bits 1x absolute
8 BITS:
129256 8 bits 10x absolute, 3 bits reinterpreted as 1x relative
129903 8 bits 10x absolute, 4 bits reinterpreted as 1x relative
130264 8 bits 10x absolute, 2 bits reinterpreted as 1x relative
131482 8 bits 10x absolute
138738 8 bits 10x absolute, 5 bits reinterpreted as 1x relative
152645* 1 bit flag, 7 bits 4x absolute or 7 bits 100x relative
174819* 1 bit flag, 5 bits 1x absolute or 7 bits 1x relative
181893* 8 bits 1x absolute
At least for this particular icon set, both encodings seem to perform relatively well, especially with scaling. The first kind of encodings does leave additional opcode space though so they are preferred. It also seems that 2x and 10x scaling factors are common, where the latter is perhaps an artifact from the decimal encoding of SVG.
It should be also noted that there are disproportionally many coordinates that are exactly equal to the previous (i.e. rel=0
). This is, of course, the original rationale of H/V opcodes. Any efficient encoding should take care of that.
Concrete Proposal
There would be five more sets of opcodes in the combined stylingdrawing mode:
 16
L
opcodes, corresponding to 116 line segments.
 16
C
opcodes, corresponding to 116 cubic Bezier segments (thus 3x coordinates).
 16
Q
opcodes, corresponding to 116 quadratic Bezier segments (thus 2x coordinates).
 one
Z
opcode, closing the current path. If there is no current path this is a nop.
 one
ZM
opcode, closing the current path and move to the following coordinates.
Alternatively, if the opcode space is sufficient the following opcodes might be possible as well:
 16
L
, C
and Q
opcodes as above.
 16
ML
, MC
and MQ
opcodes. They have an additional pair of coordinates to move before other coordinates.
 one
Z
opcode.
In any case coordinates always come in pairs, in the order of x then y. Each coordinate is encoded as follows.
< LSB MSB >
++++
1Z number  onebyte absolute
++++
+++++
01Z delta  onebyte relative
+++++
+++++++
001Z number  twobyte absolute
+++++++
++++++++++
000 mantissa  exponent S fourbyte absolute
++++++++++
Onebyte absolute encodes an absolute integer from 32 to 31. The actual number encoded is (integer + 32).
Onebyte relative encodes an integral difference from 16 to 15. The actual number encoded is (difference + 16). The reference coordinate for x is always the previous x decoded, even in the previous operations or operations with multiple coordinates pairs (unlike SVG path); same for y. The reference coordinates for the first coordinates are (0, 0).
Note: This particular encoding corresponds to "1 bit flag, 6 bits 2x absolute or 4 bits 1x relative" above. (Due to the scaling difference, the specified encoding of "1 bit flag, 6 bits 2x absolute or 5 bits 2x relative" is no worse than it.)
Note: I've also experimented with an opportunistic absolute encoding, where if the absolute range and relative range completely overlaps the relative range is reinterpreted as a distinct absolute range. This sadly had no effect for this particular icon set. Moreover reference coordinates have to be an exact integer for this to work, and it is very hard to ensure that this is repeatable. It is possible, but the resulting complexity didn't justify itself.
Twobyte absolute encodes an absolute number from 64 to (64  1/64) in little endian. The actual number encoded is (number + 64) * 64.
Fourbyte absolute encodes an IEEE 754 binary32 number in little endian. The encoded number has only 20 bits of coded mantissa (compared to 23 in the original); the lowest 3 bits are fixed to zero, allowing a direct conversion from the wire format to the number (#33). I personally don't want this at all, but the format's flexible view box demands this unfortunate addition.
A flag bit Z is in the lowest position of the decoded integer and, if set, indicates that next coordinates use a compressed relative zero and only one of two coordinates are encoded. Z is assumed to be unset in the fourbyte absolute encoding. (As mentioned above, I expect fourbyte encoding to be very uncommon anyway.)
In the initial state, Z in the x coordinate (hereafter ZX) and Z in the y coordinate (hereafter ZY) are interpreted as follows:
 ZX=0, ZY=0: Both x and y are encoded for next coordinates.
 ZX=1, ZY=0: Only y is encoded for next coordinates. The x coordinate is assumed to be same to the previous (reference) x coordinate.
 ZX=0, ZY=1: Only x is encoded for next coordinates. The y coordinate is assumed to be same to the previous (reference) y coordinate.
 ZX=1, ZY=1: Reserved. I currently don't have any good idea for this case.
In the state where the only single coordinate is encoded, Z is interpreted as follows:
 Z=0: Both x and y are encoded for next coordinates.
 Z=1: Different coordinate is encoded for next coordinates. In the other words, if x was encoded this time, Z=1 indicates that only y would be encoded next time, and so on. It is highly unlikely that the same component is repeatedly zero (only possible with some Bezier curves), so this encoding optimizes for the much more common case.
It is invalid that the very last coordinates in the run have any Z bit set. (This can be changed to affect the next run for optimization, but I'm not sure it's worth.)
It is equally valid to make this in the other direction, Z=1 indicating previous coordinates had a zero. This however was more difficult to specify and performed slightly worse than the current proposal (70.5% vs. 70.0% of the original).