Archive for the ‘csharp’ tag
Serializing XML in .NET - UTF-8 and UTF-16
When working with XML-to-object mapping, most modern languages have powerful tools or libraries that serialize and deserialize objects for you automatically, or even generate classes for you from XML schema definitions (XSDs). In the .NET world, these classes reside in the System.Xml.Serialization namespace. There is plenty of documentation available on how to use them.
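As a rough illustration of that mapping, here is a minimal sketch that reads a small XML fragment back into an object. The Book class, its element and attribute names, and the DeserializeSketch helper are made up for this example; only XmlSerializer, StringReader, and the mapping attributes come from the framework.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical type used only for illustration.
[XmlRoot("book")]
public class Book
{
    // Maps to a <title> child element.
    [XmlElement("title")]
    public string Title { get; set; }

    // Maps to a year="..." attribute on the root element.
    [XmlAttribute("year")]
    public int Year { get; set; }
}

public static class DeserializeSketch
{
    // Rebuild a Book from an XML string.
    public static Book FromXml(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Book));
        using (StringReader reader = new StringReader(xml))
        {
            return (Book)serializer.Deserialize(reader);
        }
    }
}
```

Calling DeserializeSketch.FromXml with a fragment such as `<book year="2008"><title>Example</title></book>` should give back a Book whose Title and Year are filled in from the XML.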
However, you might run into an issue when serializing your object to XML, especially if you use a StringWriter to serialize to an XML string instead of a file. .NET strings are always stored as UTF-16, and StringWriter reports UTF-16 as its encoding, so the resulting XML declares its encoding as UTF-16. One way to get around this is to create a MemoryStream, wrap it in an XmlTextWriter constructed with the UTF-8 encoding, and read the result back with a StreamReader.
The easy way of serializing an object to XML as a string (not to a file):
    XmlSerializer serializer = new XmlSerializer(typeof(YourObject));
    System.IO.StringWriter sWriter = new System.IO.StringWriter();
    serializer.Serialize(sWriter, yourObjectInstance);
    sWriter.Flush();
    string result = sWriter.ToString();
But since the StringWriter approach produces XML that declares its encoding as UTF-16, we use the following block to get UTF-8 instead:
    XmlSerializer serializer = new XmlSerializer(typeof(YourObject));

    // create a MemoryStream here, we are just working
    // exclusively in memory
    System.IO.Stream stream = new System.IO.MemoryStream();

    // one of the XmlTextWriter constructors takes
    // a stream and an encoding
    System.Xml.XmlTextWriter xtWriter = new System.Xml.XmlTextWriter(stream, System.Text.Encoding.UTF8);
    serializer.Serialize(xtWriter, yourObjectInstance);
    xtWriter.Flush();

    // go back to the beginning of the stream to read its contents
    stream.Seek(0, System.IO.SeekOrigin.Begin);

    // read back the contents of the stream and supply the encoding
    System.IO.StreamReader reader = new System.IO.StreamReader(stream, System.Text.Encoding.UTF8);
    string result = reader.ReadToEnd();
Remember, the MemoryStream is just a stream of bytes and doesn’t have to be text. That is why you create a StreamReader to read it back, and why you have to tell the reader which encoding to use.
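If the only goal is an XML declaration that says UTF-8, another commonly used option is to subclass StringWriter and override its Encoding property. This is an alternative to the MemoryStream approach, not part of it. The sketch below assumes a made-up Person type; Utf8StringWriter and Utf8DeclarationSketch are names chosen for this example and are not framework classes.

```csharp
using System.IO;
using System.Text;
using System.Xml.Serialization;

// Hypothetical sample type, used only for illustration.
public class Person
{
    public string Name { get; set; }
}

// A StringWriter that reports UTF-8 instead of the default UTF-16.
// The XML writer consults this property when it emits the declaration.
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding
    {
        get { return Encoding.UTF8; }
    }
}

public static class Utf8DeclarationSketch
{
    // Serialize to a string whose declaration reads encoding="utf-8".
    public static string Serialize(Person person)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Person));
        using (Utf8StringWriter writer = new Utf8StringWriter())
        {
            serializer.Serialize(writer, person);
            return writer.ToString();
        }
    }
}
```

Keep in mind that the returned string is still UTF-16 in memory. Only the declaration changes, so when the string is later written to a file or sent over the network it should actually be encoded as UTF-8 to match what the declaration promises.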
The idea of strings represented as bytes out in the wild, and the variety of encodings, can be a little daunting for newer developers, but as long as you keep your encodings consistent, you should have no problems. And if you need to convert to a different encoding, the built-in System.Text.Encoding class can do that for you.
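For the conversion case, a minimal sketch using System.Text.Encoding might look like the following. The EncodingConversionSketch class and its method names are invented for this example; Encoding.UTF8.GetBytes and Encoding.Convert are the framework calls doing the work.

```csharp
using System.Text;

public static class EncodingConversionSketch
{
    // Encode a .NET string (UTF-16 in memory) as UTF-8 bytes.
    public static byte[] ToUtf8Bytes(string text)
    {
        return Encoding.UTF8.GetBytes(text);
    }

    // Re-encode bytes that are already UTF-16 (little-endian) as UTF-8.
    public static byte[] Utf16ToUtf8(byte[] utf16Bytes)
    {
        return Encoding.Convert(Encoding.Unicode, Encoding.UTF8, utf16Bytes);
    }
}
```

The first method covers the usual situation where you start from a string. The second is for when you have been handed raw bytes in one encoding and need them in another.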
WPF Weekend, Part 1: Introducing CID
Over lunch with my friends at OpenSource Connections last Friday, we discussed how they could attract more people to their booth at the FOSE convention next week in Washington, DC. A recurring suggestion was to have something to dazzle people from a distance. One of the ideas we came up with was an eye-catching display, but that had only limited value.
So we decided to show attendees information that is immediately useful to them, such as the conference event schedule. So this weekend, Michael Herndon from OSC and I started to build what would become CID.
We decided from the beginning to open source the project and host it on Google Code, so that others can benefit and learn from our experience, and also so that it could be extended easily in the future. This series of posts outlines some of the things we learned in the process.