MCTS Exam 70-536 – Encoding and Decoding
Good Day,
Normally I would write about Regular Expressions today, but I need a bit more preparation on that topic, so I will skip it for now and work on Encoding and Decoding instead.
General
- ASCII (American Standard Code for Information Interchange) is the foundation of most later character encodings
- ASCII is 7 bit (0-127)
- It includes English letters in upper and lower case, punctuation, numbers and some special characters
- It does not include non-English characters
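- ANSI defined code pages for different character sets using 8 bits per character (0-127 identical to ASCII, 128-255 for language-specific characters)
- ASCII and the ISO 8859 (ANSI) encodings are being replaced by Unicode, which acts like one massive code page covering all scripts
- .NET Framework strings are Unicode, stored as UTF-16; some classes, such as StreamWriter, default to UTF-8 when writing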
- The System.Text namespace contains the classes that provide encoding and decoding: UTF32Encoding, UnicodeEncoding (UTF-16), UTF8Encoding and ASCIIEncoding; ANSI/ISO encodings are obtained through Encoding.GetEncoding
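- UTF-8 is backward compatible with ASCII: text that contains only ASCII characters produces exactly the same bytes. UTF-7 output also uses only 7-bit ASCII bytes, but it escapes some characters (such as “+”), so it is not byte-identical. UTF-16 and UTF-32 are not compatible with ASCII.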
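To see the ASCII compatibility point in practice, here is a minimal sketch that prints the bytes of the same ASCII-only string in several encodings. The class and method names (AsciiCompatibilityDemo, Hex) are my own and not from the exam material.

```csharp
using System;
using System.Text;

public static class AsciiCompatibilityDemo
{
    // Hypothetical helper: encodes the text and returns the bytes as hex, e.g. "48-65".
    public static string Hex(Encoding encoding, string text)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        string text = "Hello"; // ASCII-only sample text

        Console.WriteLine(Hex(Encoding.ASCII, text));   // 48-65-6C-6C-6F
        Console.WriteLine(Hex(Encoding.UTF8, text));    // 48-65-6C-6C-6F (same bytes as ASCII)
        Console.WriteLine(Hex(Encoding.Unicode, text)); // 48-00-65-00-6C-00-6C-00-6F-00 (UTF-16, little endian)

        // UTF-32 uses four bytes per character, so five characters take 20 bytes.
        Console.WriteLine(Encoding.UTF32.GetBytes(text).Length); // 20
    }
}
```

GetBytes does not add a byte order mark, so the output shows only the encoded characters themselves.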
Encoding classes
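- Use the static method System.Text.Encoding.GetEncoding to get an encoding object for a specific encoding, identified by name or code page number
- Use Encoding.GetBytes to convert a Unicode string to its byte representation in that encoding, and Encoding.GetString to decode the bytes back into a string
- To explore code pages, use the static method Encoding.GetEncodings, which returns an array of EncodingInfo objects representing the available encodings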
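The calls above can be sketched roughly like this. The class name, the Encode helper and the sample word are my own choices. I use "iso-8859-1" here on the assumption that it is available everywhere; other code pages may need extra setup on newer .NET versions.

```csharp
using System;
using System.Text;

public static class EncodingClassesDemo
{
    // Hypothetical helper: looks up an encoding by name and encodes the text with it.
    public static byte[] Encode(string encodingName, string text)
    {
        Encoding encoding = Encoding.GetEncoding(encodingName);
        return encoding.GetBytes(text);
    }

    public static void Main()
    {
        string text = "Gr\u00FC\u00DFe"; // "Grüße": five characters, two of them non-ASCII

        // ISO 8859-1 stores every character of this word in a single byte.
        byte[] latin1Bytes = Encode("iso-8859-1", text);
        Console.WriteLine(BitConverter.ToString(latin1Bytes)); // 47-72-FC-DF-65

        // UTF-8 needs two bytes for each of the two non-ASCII characters.
        Console.WriteLine(Encode("utf-8", text).Length); // 7

        // Decoding with the same encoding gives back the original string.
        string decoded = Encoding.GetEncoding("iso-8859-1").GetString(latin1Bytes);
        Console.WriteLine(decoded == text); // True

        // List the encodings this runtime knows about; the list differs between systems.
        foreach (EncodingInfo info in Encoding.GetEncodings())
        {
            Console.WriteLine("{0,6}  {1}  ({2})", info.CodePage, info.Name, info.DisplayName);
        }
    }
}
```

Decoding the same bytes with a different encoding would not throw an error; it would simply produce the wrong characters, which is why the encoding has to be known on both sides.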
Specify Encoding when reading or writing a file
- Use the StreamWriter and StreamReader constructor overloads that accept an Encoding object
- Typically you don’t need to specify the encoding when reading a file: StreamReader detects it from the byte order mark (BOM) at the start of the file and otherwise assumes UTF-8
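- For example, writing “Hello Kitty!!” to a file produces the following file sizes, depending on the encoding (the UTF-8, UTF-16 and UTF-32 figures correspond to the 13 characters plus a two-character line break and the byte order mark):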
- UTF-7 : 19 bytes
- UTF-8 : 18 bytes
- UTF-16: 32 bytes
- UTF-32: 64 bytes
- Notepad cannot read UTF-7 and UTF-32 files correctly
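Here is a small sketch of how the file sizes can be checked, assuming the text is written with a Windows line break. The class name, the WriteAndMeasure helper and the temporary file names are made up for this example. I left out UTF-7 because newer .NET versions mark it as obsolete.

```csharp
using System;
using System.IO;
using System.Text;

public static class FileEncodingDemo
{
    // Hypothetical helper: writes the sample line with the given encoding
    // and returns the resulting file size in bytes.
    public static long WriteAndMeasure(Encoding encoding)
    {
        string path = Path.Combine(Path.GetTempPath(), "hello-" + encoding.WebName + ".txt");

        // The third constructor argument selects the encoding; false means "overwrite".
        using (StreamWriter writer = new StreamWriter(path, false, encoding))
        {
            writer.Write("Hello Kitty!!\r\n"); // 13 characters plus CR and LF
        }

        long size = new FileInfo(path).Length;
        File.Delete(path);
        return size;
    }

    public static void Main()
    {
        // Each of these static encodings writes a byte order mark at the start of the file.
        Console.WriteLine("UTF-8 : {0} bytes", WriteAndMeasure(Encoding.UTF8));    // 18 = 3 (BOM) + 15
        Console.WriteLine("UTF-16: {0} bytes", WriteAndMeasure(Encoding.Unicode)); // 32 = 2 (BOM) + 30
        Console.WriteLine("UTF-32: {0} bytes", WriteAndMeasure(Encoding.UTF32));   // 64 = 4 (BOM) + 60
    }
}
```

Reading works the same way: new StreamReader(path, encoding) takes the encoding explicitly, while new StreamReader(path) relies on the byte order mark.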
That’s it for now. Next time I will learn about Collections and Generics.


