Người Yêu Tôi Đều Mắc Bệnh Thần Kinh Review
Answer
The thing is that by definition the multi-byte groups are not ASCII. When a byte (as you read the file in sequence one byte at a time from start to end) has a value of less than decimal 128 then it IS an ASCII character. But if you read a byte and it's anything other than an ASCII character, it indicates that it is either a byte in the middle of a multi-byte group or it is the 1st byte of a multi-byte group. In order to even attempt to come up with a direct conversion you'd almost have to know the language (code page) that is in use on the computer that created the file. It may be using Turkish while on your machine you're trying to translate into Italian, so the same characters wouldn't even appear properly - but at least they should appear improperly in a consistent manner. I think you're just going to have to sit down and spend a lot of time 'decoding' what you're getting and create your own table. Either that or get with whoever owns the system creating the files and tell them that they are NOT sending out pure ASCII comma separated files and ask for their help in deciphering what you are seeing at your end. If you read this page http://www.joelonsoftware.com/articles/Unicode.html you'll find that the author is of the opinion that lots of people (developers) THINK they know about character sets, but are actually about as clueless as I generally am about them. By the way - the five and six byte groups were removed from the standard some years ago. Did you try running a test file through my code and looking at the output to see if it even looked reasonably close? Unless they're doing something strange at their end, 'standard' characters such as the apostrophe shouldn't even be inside a multi-byte group. An apostrophe ' has ASCII decimal value of 39, while the grave ` has an ASCII decimal value of 96.
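To make the byte-by-byte reasoning above concrete, here is a minimal Python sketch. It assumes the file is UTF-8, which the mention of five and six byte groups suggests but which nobody has confirmed. The function name classify_byte and the labels it returns are made up for illustration, and it only looks at one byte at a time, so it does not check that a lead byte is followed by the right number of continuation bytes.

```python
def classify_byte(b: int) -> str:
    """Classify one byte value (0-255) by its role in a UTF-8 stream.

    Assumption: the data is UTF-8. The labels are illustrative only.
    """
    if b < 0x80:
        return "ascii"         # 0xxxxxxx: a complete ASCII character
    if b < 0xC0:
        return "continuation"  # 10xxxxxx: middle or end of a multi-byte group
    if b in (0xC0, 0xC1) or b >= 0xF5:
        return "invalid"       # never legal in UTF-8 (includes old 5/6-byte leads)
    if b < 0xE0:
        return "lead-2"        # 110xxxxx: 1st byte of a 2-byte group
    if b < 0xF0:
        return "lead-3"        # 1110xxxx: 1st byte of a 3-byte group
    return "lead-4"            # 11110xxx: 1st byte of a 4-byte group


# Walk a small sample one byte at a time, as described above.
sample = "it's café".encode("utf-8")
for value in sample:
    print(value, classify_byte(value))
```

In the sample, the apostrophe comes out as a plain ASCII byte (39), while the accented letter shows up as a lead byte followed by one continuation byte. If your file produces lots of "invalid" bytes, that is a hint it is not UTF-8 at all and a code page is in play instead.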
Here's the entire ASCII character set - some such as 7 (bell) and 10 and 13 are non-printable since everything below decimal value 32 is considered to be a "control" code. You'll see that nothing is really visible until 33 - the ! mark, although the character at the right end of the row above is a true ASCII space character (right above the ( symbol). 1 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ 127 I cannot believe that this system threw away the entire grid and lined them up as a single string of text characters! :( and even though it says it's accepting a graphic of the grid - it's not showing up either! And it seems to have removed all of the line feeds in the post making one huge paragraph out of what was written as at least 6 separate paragraphs.
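Since the grid did not survive posting, here is a rough Python sketch that regenerates something like it on your own machine. The 8-column layout, the <n> notation for control codes and the "SP" label for the space are my own choices for illustration, not necessarily how the original grid looked.

```python
def ascii_cell(code: int) -> str:
    """Return a visible label for one ASCII code (0-127)."""
    if code < 32 or code == 127:
        return f"<{code}>"  # control code: show its decimal value instead
    if code == 32:
        return "SP"         # the space is a real character, just invisible
    return chr(code)


def ascii_grid(columns: int = 8) -> list[str]:
    """Lay out codes 0-127 as rows of tab-separated cells."""
    cells = [ascii_cell(code) for code in range(128)]
    return ["\t".join(cells[i:i + columns]) for i in range(0, 128, columns)]


for row in ascii_grid():
    print(row)
```

With 8 columns the space (32) sits directly above the ( symbol (40), and the apostrophe (39) and grave (96) both appear as ordinary single-byte characters.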
I am free because I know that I alone am morally responsible for everything I do. R.A. Heinlein
Source: https://answers.microsoft.com/en-us/msoffice/forum/all/translating-unusual-characters-back-to-normal/a819a82f-dad8-4072-bb05-49c087d4d6b2