Fast, idiomatic C# implementation of FlatBuffers
FlatSharp 7.6.0 is a small feature release that adds a couple of quality-of-life features:
- New `Match` methods on Union types that accept delegates. These aren't the most performant but work well for one-offs or cases where you don't want to implement a Visitor.
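For illustration, here is a minimal, self-contained sketch of the delegate-based pattern. The `Shape`, `Circle`, and `Square` types are hypothetical stand-ins, not FlatSharp-generated code, and the real generated `Match` overloads may differ:

```csharp
using System;

public record Circle(double Radius);
public record Square(double Side);

// Illustrative stand-in for a generated union type.
public readonly struct Shape
{
    private readonly object value; // holds a Circle or a Square

    public Shape(Circle c) => this.value = c;
    public Shape(Square s) => this.value = s;

    // A Match method in the style described above: one delegate per union member.
    public TResult Match<TResult>(Func<Circle, TResult> circle, Func<Square, TResult> square)
        => this.value switch
        {
            Circle c => circle(c),
            Square s => square(s),
            _ => throw new InvalidOperationException("Union has no value"),
        };
}

public static class Demo
{
    // One-off dispatch without writing a Visitor class.
    public static double Area(Shape shape)
        => shape.Match(c => Math.PI * c.Radius * c.Radius, s => s.Side * s.Side);
}
```

The delegate allocation is why this is slower than a struct-based Visitor, but for occasional calls the convenience usually wins.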
- Optional support for `file` visibility on types in FlatSharp-generated code. This reduces the clutter you'll see in your code from FlatSharp. Since `file` visibility is only supported on C# 11 and above, you will need to opt into this behavior:
```xml
<PropertyGroup>
    <FlatSharpFileVisibility>true</FlatSharpFileVisibility>
</PropertyGroup>
```
From the command line: `dotnet FlatSharp.Compiler.dll --file-visibility`
In addition to these features, FlatSharp's primary unit tests now all run in NativeAOT mode as well as JIT mode thanks to the new NativeAOT-compatible MSTest runner.
Full Changelog: https://github.com/jamescourtney/FlatSharp/compare/7.5.1...7.6.0
FlatSharp 7.5.1 is a minor release that contains important fixes for NativeAOT support.
This release fixes issues in the `IInputBuffer` and `ISpanWriter` implementations that led to the problems with AOT.

Full Changelog: https://github.com/jamescourtney/FlatSharp/compare/7.5.0...7.5.1
FlatSharp 7.5.0 is a medium-sized release with a few changes that may be significant for you. It's now published on NuGet.org. There are no breaking changes.
The primary focus of this release is to significantly reduce the size of the x64 assembly produced by the JIT. This is accomplished in three ways:
- String serialization is no longer inlined. This reduces code size substantially but does have a modest impact on serialization speed. Other changes offset most of the performance loss, and FlatSharp should now play much more nicely with your instruction cache.
- Exceptions are thrown from hot paths using `ThrowHelper`-style methods.
- Most `checked` arithmetic has been removed. FlatSharp already uses safe methods for interacting with memory, so the only `checked` operations that remain are multiplications and left shifts. This removes many branch instructions and further compacts the generated assembly. If you have a security need to retain `checked` arithmetic everywhere, please consider compiling with `<CheckForOverflowUnderflow>true</CheckForOverflowUnderflow>` in your `csproj`.
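As a refresher on what this trade-off controls, the standalone snippet below (not FlatSharp code) shows the difference between `checked` and `unchecked` integer multiplication in C#:

```csharp
using System;

public static class OverflowDemo
{
    // Unchecked arithmetic silently wraps around on overflow.
    public static int WrappingMultiply(int a, int b) => unchecked(a * b);

    // Checked arithmetic throws OverflowException instead of wrapping.
    public static bool ThrowsWhenChecked(int a, int b)
    {
        try
        {
            checked { _ = a * b; }
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }
}
```

Setting `<CheckForOverflowUnderflow>true</CheckForOverflowUnderflow>` makes the `checked` behavior the project-wide default.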
How much of a difference does this make? Let's look at a contrived example:
```fbs
table MailingAddress (fs_serializer)
{
    to : string;
    street : string;
    city : string;
    zip_code : int;
}
```
Examining the bytes of code generated to serialize this table yields:

Version | Performance | Table Serialize Bytes | String Serialize Bytes | Total
---|---|---|---|---
7.4.0 | 60ns | 3888 (inlined) | 0 | 3888
7.5.0 | 58ns | 988 (method calls) | 550 | 1538
This effect will scale for each string property in your schema, so while this example is contrived, the benefit should be large for applications where strings are a common data type.
Note: The focus is on shrinking the size of the code generated by the JIT. The generated C# is largely unchanged.
There are some other changes as well:
- Added a `[DebuggerTypeProxy]` attribute to all generated classes. This ensures that much of the internal FlatSharp state is excluded from debugging views and makes the debugging experience more seamless.

Full Changelog: https://github.com/jamescourtney/FlatSharp/compare/7.4.0...7.5.0
FlatSharp 7.4.0 is available on NuGet with a couple of nonbreaking changes:

- `FlatSharp.Runtime` supports .NET 8.
- `FlatSharp.Compiler` now requires any of .NET 8, .NET 7, or .NET 6 to be installed (previously it supported only .NET 6).
- Fixed an issue that occurred when `FlatSharp` was declared as a non-top-level namespace, such as `Foo.Bar.FlatSharp`.

Full Changelog: https://github.com/jamescourtney/FlatSharp/compare/7.2.3...7.4.0
FlatSharp 7.2.3 is a small release that fixes a few issues.
7.2.0 is a minor feature release that resolves a few bugs and adds a small number of enhancements to the FlatSharp compiler package.
You can now select which deserializers FlatSharp should generate. This allows some degree of control over the size of the code FlatSharp produces:
```xml
<PropertyGroup>
    <FlatSharpDeserializers>Lazy;Greedy</FlatSharpDeserializers>
</PropertyGroup>
```
Using the command line, you can pass `--deserializers Progressive;GreedyMutable`.
In the same vein, FlatSharp can be configured to only emit class definitions. This allows sharing a set of common schemas between projects.
```xml
<PropertyGroup>
    <FlatSharpClassDefinitionsOnly>true</FlatSharpClassDefinitionsOnly>
</PropertyGroup>
```
Using the command line, you can pass `--class-definitions-only`.
Normally, FlatSharp will process all FBS files in the recursive graph, including `include` references that were not explicitly passed to FlatSharp. With this switch, FlatSharp can be configured to only emit output for explicitly-passed input files.
```xml
<PropertyGroup>
    <FlatSharpInputFilesOnly>true</FlatSharpInputFilesOnly>
</PropertyGroup>
```
Using the command line, you can pass `--input-files-only`.
7.2.0 includes a small set of quality-of-life enhancements:
- Added `[DebuggerDisplay]` attributes to make using the debugger a little less cumbersome.

7.1.1 is a bugfix release to partially address #372. Negative default values should now work correctly.
FlatSharp 7.1 is an important update to 7.0 that brings several notable improvements! Thanks to @joncham and @bangfalse for their contributions to this release.
Thanks to @joncham of Unity3D, FlatSharp now supports Unity's `NativeArray` as a built-in vector type. To enable this functionality, pass the `--unity-assembly-path` argument with the path to `UnityEngine.dll` to the FlatSharp compiler. This enables the new vector type `UnityNativeArray`:
```fbs
table MyTable
{
    Values : [ Vec3 ] (fs_vector:"UnityNativeArray");
}
```
One of the big improvements in FlatSharp 7.0 was vector performance. I even spent a lot of time bragging about it in the release notes! And I wasn't wrong... when using default settings. FlatSharp 7.0 improved vector performance by almost 25% relative to 6.3.5, which is indeed a nice improvement. However, what I didn't observe was that FlatSharp 7.0 actually regressed by 32% relative to 6.3.5 when PGO was turned on!
Mode | Struct Type | 6.3.5 (Default) | 7.0.2 (Default) | 7.1.0 (Default) | 6.3.5 (PGO) | 7.0.2 (PGO) | 7.1.0 (PGO)
---|---|---|---|---|---|---|---
Lazy | Ref | 281ns | 184ns | 172ns | 111ns | 118ns | 115ns
Lazy | Value | 172 | 81 | 81 | 51 | 56 | 51
Progressive | Ref | 472 | 355 | 232 | 283 | 298 | 195
Progressive | Value | 156 | 136 | 124 | 83 | 127 | 83
Greedy | Ref | 416 | 301 | 214 | 253 | 373 | 174
Greedy | Value | 175 | 139 | 99 | 91 | 112 | 74
GreedyMutable | Ref | 404 | 297 | 268 | 232 | 367 | 211
GreedyMutable | Value | 151 | 141 | 110 | 78 | 113 | 75
Relative Perf | | 100% | 73% | 58% | 53% | 70% | 44%
The table above shows all the permutations of FlatSharp version, Reference/Value type, Deserialization mode, and PGO On/Off. The tl;dr of the data is that FlatSharp 7.1 is the fastest FlatSharp ever, with PGO on or off.
So, what changed?
Up until version 7.1, FlatSharp vectors have all included a generic class hosted in `FlatSharp.Runtime` that looked something like this:

```csharp
public sealed class FlatBufferVector<T, TInputBuffer> : IList<T> where TInputBuffer : IInputBuffer { ... }
```
While C# generics accomplish many of the same roles as C++ templates, they are fundamentally different beasts. FlatSharp 7.1 emits unique class definitions for each vector, much like how the C++ compiler emits unique instances for each template combination:

```csharp
public sealed class VectorString<TInputBuffer> : IList<string> where TInputBuffer : IInputBuffer { ... }
public sealed class VectorInt<TInputBuffer> : IList<int> where TInputBuffer : IInputBuffer { ... }
```
An astute reader will notice that the largest improvements in the table above were for reference types instead of value types. This is not a coincidence. When using generic methods and classes, the JIT is very lazy and will often share implementations of methods with reference type generic arguments. This necessarily adds overhead for virtual indirection, which slows things down. By using entirely separate classes, we force the JIT to generate vector code that is tailored to each derived type, instead of the base type. PGO covers some of this up with devirtualization, which is why the gains are a bit less impressive there.
Finally, please note that these benchmarks are designed to stress the vector handling aspects of FlatSharp, so this does not mean that your overall performance will improve by these numbers when using version 7.1 unless your schema is trivially simple.
FlatSharp is used in production in real-life applications, which is great. Please tell your friends! However, as a mostly-solo project, this means that releasing a new version is sometimes a little stressful, especially in a version like 7.1 where the entire Vector stack is being rewritten.
The combination of "used in production" and "solo developer" is a tricky line to walk, which is why I've held the project to such a high bar (95%) on code coverage. There's always room to do better, and FlatSharp 7.1 addresses two major test gaps that should further increase the quality of the project.
The first is a dedicated test to ensure that generated code will build with C# language version 8, which prevents accidental uses of `new()`, switch expressions, or other fancy new language features from creeping into the generated code. This has caused a few bugs in the past because FlatSharp itself uses C# 11 features, so catching these accidents with automation is a win.
The second new automated test suite uses Stryker Mutator, which is a tool to measure the effectiveness of tests by injecting bugs into code and seeing which are not caught by the tests. Essentially, Stryker adds a bug into the source code, and if no tests fail, then that code clearly isn't tested. After all, tests that don't capture any bugs aren't very useful. While this sounds great (and it is!), the implementation is a little tricky with FlatSharp, since it is code to generate code. For example, FlatSharp might contain something like
```csharp
public string GetAddMethod() => "public int Add(int a, int b) => a + b;";
```

Stryker might perform this mutation:

```csharp
public string GetAddMethod() => string.Empty;
```

While this makes a lot of sense for normal code, it's not very useful for FlatSharp. What we really want Stryker to do is mutate the code inside the string, so that `a + b` changes to `a - b`. The solution, of course, is to run Stryker on the output of FlatSharp, not FlatSharp itself.
The FlatSharp Stryker tests exhaustively test the code FlatSharp emits for a single, relatively broad FlatBuffer schema. The unit tests do the opposite and cover every scenario across a ton of tiny schemas. It isn't practical to cover everything using Stryker, but the new tests exercise a wide range of FlatSharp's functionality. While not a substitute for unit tests, Stryker testing is a great complement since it ensures FlatSharp's test suite is resilient to a wide variety of incidental changes.
And finally, despite spending the last few paragraphs talking about FlatSharp's previous test gaps, the new tests have identified no breaking bugs. They've exposed plenty of small things about code that could be better factored to be more testable and some small problems like inconsistent Exception types depending on deserialization mode. But overall, FlatSharp is in an excellent place from a quality standpoint.
Full Changelog: https://github.com/jamescourtney/FlatSharp/compare/7.0.2...7.1.0
7.0.2 is a hotfix release that addresses an issue where FlatSharp's generated code wouldn't compile for C# 8.
Welcome to FlatSharp 7! Version 7 is a major release that includes many under-the-hood improvements such as .NET 7 support, performance improvements for several cases, generated code deduplication, experimental object pools, and some big breaking changes. This is a long list of changes, so buckle up!
Let's start with the bad news: Reflection-only Runtime mode is not making the jump to FlatSharp 7.
The original version of FlatSharp did not include a compiler or support for FBS files. Instead, it used attribute-based reflection, as so many other .NET serializers do, with attributes like `[FlatBufferTable]` that allowed generation of code at runtime. FlatSharp.Compiler was introduced in early 2020, and in version 6 it was ported to use `flatc` as its parser instead of the custom grammar. Today, there is really no reason to use reflection mode any longer. The FlatSharp compiler is mature and enables all of the same semantics, with a slew of additional benefits over runtime mode:
- Compatibility with `flatc` and `flatcc`.

For these reasons, the trend has been that most new features and development of FlatSharp use the compiler rather than the reflection-only mode. So in version 7, FlatSharp is dropping support for runtime code generation. This is a difficult decision, but one that allows the project to keep moving forward and is broadly aligned with most customers' usage of FlatSharp, along with the bigger trends in .NET, such as AOT and source generators. Going forward, only two packages will see new versions published: `FlatSharp.Runtime` and `FlatSharp.Compiler`.
Removing runtime-only mode removes an entire class of human error from FlatSharp and makes AOT easier to reason about. One immediate benefit of this change is that the `FlatSharp.Runtime` package no longer makes any internal reflection calls, which makes AOT less error-prone. Unfortunately, it does also remove support for some features, such as Type Facades.
Array and `fs_nonVirtual` Deprecation
In the interest of helping engineers make good decisions about how to use FlatSharp, Array vectors (`fs_vector:"Array"`) and non-virtual properties have been deprecated. These behave very unpredictably with FlatSharp, because:

- Array vectors are always greedy, even with `Lazy` deserialization. Accessing an Array property on a `Lazy` object simply allocates a new array each time. Not good!
- Their behavior is inconsistent across `Lazy`, `Greedy`, and `Progressive` modes.
- They do not support write-through (`fs_writeThrough`).
- Other vector types, such as `IList<T>`, provide much of the performance, with an order of magnitude more flexibility.

When serializing vectors, FlatSharp does still attempt to devirtualize `IList<T>` and `IReadOnlyList<T>` to arrays for the performance boost, but this is more an implementation detail of the handling for `IList<T>` than it is about continuing to support arrays explicitly.
FlatSharp version 7 supports .NET 7. What a happy coincidence! This is not a major change. However, there are a few things worth calling out.
The first is that `IFlatBufferSerializable<T>` has a few new members when using .NET 7:

```csharp
public interface IFlatBufferSerializable<T> where T : class
{
    ISerializer<T> Serializer { get; } // this already existed

#if NET7_0_OR_GREATER
    static abstract ISerializer<T> LazySerializer { get; }
    static abstract ISerializer<T> ProgressiveSerializer { get; }
    static abstract ISerializer<T> GreedySerializer { get; }
    static abstract ISerializer<T> GreedyMutableSerializer { get; }
#endif
}
```
This allows writing code like this:
```csharp
public static T Parse<T>(byte[] buffer) where T : class, IFlatBufferSerializable<T>
{
    // All serializer types are available now, as well.
    return T.LazySerializer.Parse(buffer);
}
```
Next, FlatSharp 7 supports `required` members. That is, when you specify the `required` FBS attribute, the generated C# also contains that annotation:
```fbs
table RequiredTable
{
    Numbers : [ int ] (required);
}
```
Now generates:
```csharp
public class RequiredTable
{
    public required virtual IList<int> Numbers { get; set; }
}
```
This is, of course, only available for those of you using .NET 7. If you aren't ready for .NET 7 just yet, then no problem; FlatSharp will continue to generate code that works with .NET 6, .NET Standard 2.0, and .NET Standard 2.1, though you may see `#if NET7_0_OR_GREATER` peppered throughout your generated code now!
Finally, FlatSharp 7 fully supports .NET 7 AOT. Effort has been made to remove the last vestiges of reflection from the `FlatSharp.Runtime` package.
Previous versions of FlatSharp generated separate code for each root type. That is, imagine this schema:
```fbs
// A big table
table Common
{
    A : string;
    B : string;
    ...
    Z : string;
}

table Outer1 (fs_serializer) { C : Common }
table Outer2 (fs_serializer) { C : Common }
```
In this case, FlatSharp 6 and below would generate two full serializers and parsers for `Common`, since it is used by both `Outer1` and `Outer2`. This led to cases where commonly-used large tables would have more than one serializer implementation generated. Such code duplication leads to poor cache performance, poor utilization of the branch predictor, and an explosion of code when the type in question is used in multiple places, such as gRPC definitions.
FlatSharp 7 does the expected thing and generates only one serializer/parser for each type. As part of this change, FlatSharp 7 emits all serializer types, which comes with the happy accident of allowing you to choose one at runtime:

```csharp
ISerializer<Outer1> lazySerializer = Outer1.Serializer.WithSettings(settings => settings.UseLazyDeserialization());
```
FlatSharp 7 improves the performance of `IList<T>` vectors by enabling devirtualization of internal method calls. Previous versions of FlatSharp defined the base vector along these lines:
```csharp
public abstract class BaseVector<T> : IList<T>
{
    public T this[int index]
    {
        get => this.ItemAtIndex(index);
    }

    protected abstract T ItemAtIndex(int index);
}
```
Virtual methods do have a cost, because the generated assembly must first consult the vtable and then jump to the actual method. A better way to write the same code is with this technique:
```csharp
public interface IVectorAccessor<T>
{
    T GetItemAtIndex(int index);
}

public class Vector<T, TVectorAccessor>
    where TVectorAccessor : struct, IVectorAccessor<T>
{
    private readonly TVectorAccessor accessor;

    public T this[int index]
    {
        get => this.accessor.GetItemAtIndex(index);
    }
}
```
But wait! Aren't these the same thing? Both are calling a virtual method after all. The trick is subtle, and involves generics and structs. Stephen Cleary writes about it more clearly than I can here.
This technique is not new to FlatSharp. Support for this trick was added in Version 4, which is why `IInputBuffer` implementations have been structs and the entire parsing stack is templatized. However, the opportunity to do the same for vectors was only recently discovered, and the improvements are impressive. Iterating through a `Lazy` vector is often 20-50% faster.
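To make the mechanism concrete, here is a runnable, self-contained version of the pattern above, with a toy `SquareAccessor` standing in for FlatSharp's real input-buffer-backed accessors:

```csharp
using System;

public interface IVectorAccessor<T>
{
    T GetItemAtIndex(int index);
}

// A struct implementation. Because the generic argument below is constrained
// to `struct`, the JIT compiles a dedicated Vector<int, SquareAccessor>
// specialization whose call to GetItemAtIndex is non-virtual and inlinable.
public readonly struct SquareAccessor : IVectorAccessor<int>
{
    public int GetItemAtIndex(int index) => index * index;
}

public sealed class Vector<T, TVectorAccessor>
    where TVectorAccessor : struct, IVectorAccessor<T>
{
    private readonly TVectorAccessor accessor;

    public Vector(TVectorAccessor accessor) => this.accessor = accessor;

    public T this[int index] => this.accessor.GetItemAtIndex(index);
}
```

Usage: `new Vector<int, SquareAccessor>(default)[5]` yields 25, with no vtable lookup in the specialized code the JIT emits.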
Additionally, VTable parsing has been improved by several whole nanoseconds! Joking aside, this is important: every table has a VTable, so VTable parsing is one of the operations at FlatSharp's very core, and the benefit multiplies with the number of tables you read. Only tables with 8 or fewer fields benefit from this optimization; larger tables fall back to the previous behavior.
Here's a quick teaser of FlatSharp parse/traverse performance of a vector of structs, both reference and value:
Method | Mean | Error | StdDev | P25 | P95 | Code Size | Gen0 | Allocated |
---|---|---|---|---|---|---|---|---|
Parse_Vector_Of_ReferenceStruct_FlatSharp7 | 247.95 ns | 22.909 ns | 8.169 ns | 242.53 ns | 258.21 ns | 291 B | 0.0787 | 1320 B |
Parse_Vector_Of_ReferenceStruct_FlatSharp6 | 324.56 ns | 4.396 ns | 1.568 ns | 323.32 ns | 326.39 ns | 263 B | 0.0787 | 1320 B |
Parse_Vector_Of_ValueStruct_FlatSharp7 | 93.86 ns | 1.254 ns | 0.447 ns | 93.71 ns | 94.29 ns | 279 B | 0.0072 | 120 B |
Parse_Vector_Of_ValueStruct_FlatSharp6 | 194.78 ns | 8.377 ns | 2.987 ns | 192.61 ns | 197.42 ns | 251 B | 0.0072 | 120 B |
Object pooling is a technique that reduces allocations by returning objects to a pool and re-initializing them later. FlatSharp 7's object pool is experimental; the intent of this release is to get the feature into the wild and see how well it works. With object pooling enabled, it is possible to use FlatSharp in true zero-allocation mode. The wiki has full details.
In version 6.2.0, FlatSharp introduced an optional switch to normalize `snake_case` fields into `UpperPascalCase`. This was off by default in version 6 for compatibility reasons. In FlatSharp 7, field name normalization is on by default. There are three ways to disable it:
Add this to your `csproj` file:
```xml
<PropertyGroup>
    <FlatSharpNameNormalization>false</FlatSharpNameNormalization>
</PropertyGroup>
```
Annotate your tables and structs with `fs_literalName`:
```fbs
table MyTable (fs_literalName)
{
    wont_be_normalized : string;
    wont_be_normalized_either : string;
}

struct MyStruct
{
    will_be_normalized : int;
    wont_be_normalized : int (fs_literalName);
}
```
Finally, if you use the FlatSharp compiler as a command line tool, you can pass `--normalize-field-names false` as a command line argument.
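The normalization rule itself is simple. Here is a rough sketch of `snake_case` to `UpperPascalCase` conversion; this is an illustration only, and FlatSharp's actual implementation may handle more edge cases (leading underscores, digits, etc.):

```csharp
using System;
using System.Linq;

public static class NameNormalizer
{
    // Sketch: split on underscores, capitalize each piece, and concatenate.
    public static string ToUpperPascalCase(string snakeCase)
        => string.Concat(
            snakeCase.Split(new[] { '_' }, StringSplitOptions.RemoveEmptyEntries)
                     .Select(part => char.ToUpperInvariant(part[0]) + part.Substring(1)));
}
```

So a field declared as `zip_code : int` surfaces in C# as a `ZipCode` property unless normalization is disabled.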
FlatSharp 7 adds a new annotation (`fs_unsafeExternal`) that may be applied to value structs and enums. The annotation indicates to FlatSharp that the type is defined externally to the FBS schema. Serializers are generated based on the definition, but no type definitions are emitted. This allows doing things like referencing hardware-accelerated types, such as `System.Numerics.Vector3`, from a FlatSharp schema:
```fbs
struct Vec3 (fs_valueStruct, fs_unsafeExternal:"System.Numerics.Vector3")
{
    x : float32;
    y : float32;
    z : float32;
}

table Path { Points : [ Vec3 ]; }
```
In this example, the `Points` property in the `Path` table will be of type `IList<System.Numerics.Vector3>`. There are a few catches to this that you need to be aware of. The first is that it's not safe. FlatSharp is able to provide exactly two validations: the size of the struct and the endianness of the machine.
There are many additional things beyond the size of the struct and the endianness of the machine that FlatSharp cannot validate with external structs:
- Types whose size can vary, such as `System.Numerics.Vector<byte>`. `Vector<byte>` is 32 bytes when running under JIT on a machine supporting AVX2. In the future it may be 64 bytes on an AVX512 machine. Or it might be 16 bytes if using .NET 7 AOT compilation.
- External struct definitions are not visible to `flatc` or `flatcc`.

External structs can be a powerful tool, but they are risky and need thorough testing to ensure correct behavior. If in doubt, it is advisable to simply extend the partial struct declarations and add an implicit conversion operator to the value struct in question.
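A sketch of that safer approach follows. `Vec3` here is a hypothetical stand-in for the FlatSharp-generated partial value struct (the real generated struct exposes properties rather than fields, so adapt accordingly):

```csharp
using System.Numerics;

// Stand-in for the FlatSharp-generated partial value struct.
public partial struct Vec3
{
    public float X, Y, Z;
}

// Extend the partial declaration with implicit conversions to and from the
// hardware-accelerated System.Numerics.Vector3, instead of fs_unsafeExternal.
public partial struct Vec3
{
    public static implicit operator Vector3(Vec3 v) => new Vector3(v.X, v.Y, v.Z);
    public static implicit operator Vec3(Vector3 v) => new Vec3 { X = v.X, Y = v.Y, Z = v.Z };
}
```

This keeps FlatSharp in full control of the serialized layout while letting the rest of your code work with `Vector3` transparently.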
In other Unsafe news, FlatSharp 7 supports unsafe unions. Unsafe unions are optimizations that apply to unions consisting only of value types. Imagine this FBS schema:
```fbs
attribute "fs_valueStruct";
attribute "fs_unsafeUnion";

struct FourBytes (fs_valueStruct) { x : int; }
struct EightBytes (fs_valueStruct) { x : long; }
struct SixteenBytes (fs_valueStruct) { x : [ long : 2 ]; }

union MyUnion { FourBytes, EightBytes, SixteenBytes }
union MyUnsafeUnion (fs_unsafeUnion) { FourBytes, EightBytes, SixteenBytes }
```
Previous versions of FlatSharp would box those value types into an `object` inside the union:

```csharp
public struct MyUnion
{
    // structs get boxed in here too.
    private readonly object value;
}
```
Unsafe unions store their data as a fixed buffer sized to the largest element:

```csharp
public unsafe struct MyUnsafeUnion : IFlatBufferUnion<FourBytes, EightBytes, SixteenBytes>
{
    // 16 corresponds to the size of the biggest struct in the union.
    private fixed byte value[16];
}
```
This allows `MyUnsafeUnion` to carry all of the values for all of the value types in a single fixed element, without the need for boxing or allocations! There are a couple of caveats to this feature:

- `fs_unsafeUnion` will raise an error if there are any reference types in the union. Experimentally, performance degrades significantly when a union carries a mix of `object` and `fixed byte[]` elements.
- Take care when combining unsafe unions with `fs_unsafeExternal` value structs; if FlatSharp's size assumptions are wrong for `fs_unsafeExternal` value structs, you can get runtime exceptions.
- Your project must be compiled with `<AllowUnsafeBlocks>true</AllowUnsafeBlocks>`.
Unsafe unions offer reduced allocations and increased performance when traversing a vector:
Method | Mean | Error | StdDev | P25 | P95 | Code Size | Gen0 | Allocated |
---|---|---|---|---|---|---|---|---|
ParseAndTraverse_SafeUnionVector | 511.10 ns | 16.573 ns | 5.910 ns | 506.62 ns | 519.09 ns | 1,370 B | 0.0544 | 912 B |
ParseAndTraverse_UnsafeUnionVector | 396.93 ns | 12.903 ns | 3.351 ns | 394.61 ns | 401.33 ns | 1,339 B | 0.0076 | 128 B |
Even though FlatSharp 7 is a major release, there are not a ton of breaking changes if you are using the `FlatSharp.Compiler` package, so upgrading should be straightforward.

- `IInputBuffer2` has been deleted and merged into `IInputBuffer`. Assorted changes have been made to `IInputBuffer`.
- `Switch` methods on Unions have been deleted. Please use the new `Accept`/`Visitor` pattern instead.
- `ISerializer<T>.WithSettings` now accepts a delegate instead of a settings object. This change is largely semantic but allows easier usage.
- The `ISerializer<T>` properties `CSharp`, `Assembly`, and `AssemblyBytes` have been removed.
- `ISerializer<T>.Parse` now accepts an optional `DeserializationMode` argument.
- Supported vector types now include `IList`, `IReadOnlyList`, `IIndexedVector`, `Memory`, and `ReadOnlyMemory`.
- The `fs_nonVirtual` attribute no longer has any meaning. All properties are virtual now.
- Vectors of `ubyte` now default to using `Memory` instead of `IList` if no type is specified.
- The `--input` parameter now accepts a semicolon-delimited list of files: `dotnet FlatSharp.Compiler.dll --input "file1.fbs;file2.fbs;file3.fbs" --output .`
- Generated files are now named `FlatSharp.generated.cs` instead of `SchemaName.fbs.generated.cs`. Note: You may experience build errors after upgrading due to type conflicts. Be sure to clean your `obj` folder of generated files first.

Full Changelog: https://github.com/jamescourtney/FlatSharp/compare/6.3.3...7.0.0