 Removing Hexadecimal Character from XML File

Hello All,
 
    I have a problem parsing an XML file.

    In our application we read an XML file as input using XmlTextReader in C#.

    The XML file contains a control character (hexadecimal value 0x07), and the reader.Read() method fails with the following error:

    'The File is not a proper xml file parse error is '', hexadecimal value 0x07

    This is an urgent issue. Can anyone suggest a way to remove this character before parsing, or a character set that would let us read or eliminate it?

    For testing, we currently just remove the character using StreamReader, but that approach consumes a lot of memory because the file can be large.

    It would be great if you could help me out with this as soon as possible.
 
    Regards
   
    Kunal Shah
   
    9342265409

Started By kunalshah on Feb 1, 2006 at 9:48:34 PM

9 Response(s)

Replies 3 to 9 of 9
viral.pandya on Jul 15, 2007 at 10:23:01 PM (# 3)

I have the same problem, and I have to deal with a 24 GB file, which I cannot open manually to remove the character.
Please help both of us if anybody can.


MHenke on Jul 17, 2007 at 6:12:45 AM (# 4)

Same answer.

An XML processor is an XML processor and won't process non-XML data.

Pre-processing doesn't necessarily mean doing it manually. Choose any (scripting) language to read in the file chunk by chunk, remove the invalid code points and re-write it as XML.
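
A minimal sketch of that chunk-by-chunk pre-processing in C#, since that is what the thread is using. The class and method names (XmlPreCleaner, IsLegalXmlChar, CopyWithoutIllegalChars) are made up for illustration, and the 64 KB buffer size is an arbitrary assumption. The check follows the Char production of XML 1.0 and assumes surrogate halves in the input are correctly paired.

```csharp
using System;
using System.IO;

public static class XmlPreCleaner
{
    // True for UTF-16 code units that XML 1.0 allows: tab, LF, CR and
    // 0x20..0xFFFD. Surrogate halves (0xD800..0xDFFF) are let through on the
    // assumption that they are correctly paired in the input.
    public static bool IsLegalXmlChar(char c)
    {
        return c == '\t' || c == '\n' || c == '\r'
            || (c >= 0x20 && c <= 0xFFFD);
    }

    // Copies input to output one buffer at a time, dropping illegal
    // characters. Returns how many characters were dropped. Memory use is
    // bounded by the buffer, not by the file size.
    public static long CopyWithoutIllegalChars(TextReader input, TextWriter output)
    {
        char[] buffer = new char[64 * 1024];
        long removed = 0;
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            int kept = 0;
            for (int i = 0; i < read; i++)
            {
                if (IsLegalXmlChar(buffer[i]))
                    buffer[kept++] = buffer[i];   // compact legal chars in place
                else
                    removed++;
            }
            output.Write(buffer, 0, kept);
        }
        return removed;
    }
}
```

It could be called with a StreamReader on the damaged file and a StreamWriter on a new file (for example "input.xml" and "clean.xml", both placeholder names), after which the cleaned copy is handed to the XML parser. Note that this only removes raw characters; a character reference such as &#x7; in the file would still be rejected by a conforming parser.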


SRIuri on Sep 19, 2007 at 2:22:39 PM (# 5)

Removing Hexadecimal Character from XML File

 

Could anybody reply to this, please? It is urgent. Kunal Shah, did you find a solution?


MHenke on Sep 19, 2007 at 11:11:20 PM (# 6)

Yep, someone can, and someone already did. What I wrote above is an objective fact. Maybe inconvenient to you, but ignoring it won't change the XML spec.


bod1467 on Sep 20, 2007 at 1:25:05 AM (# 7)
This message has been edited.

Some people are too stubborn to believe what they are told - they'll always assume they are right. :-)


navyjax2 on Jan 16, 2009 at 3:30:16 PM (# 8)
This message has been edited.

Well it would be nice to find out how to remove said character before the XML is written.  I have the same issue, just with the character 0x00.

My character gets put into a text file with the content I am trying to write to XML when I convert a PDF to text.  I read that text file in with a StreamReader in C# to a string variable.  I was then going to have that string variable written into a "WriteCData" element using XmlWriter, but then this issue arose.

Does someone have code to use StreamReader to check the content of a text file for illegal hex characters and replace them with ASCII equivalents, and return the content as a string variable?  I'm sure we'd all be very grateful - I know I would.

Thanks in advance,

Tom


navyjax2 on Jan 16, 2009 at 5:17:25 PM (# 9)

Actually, I found something I was able to modify to my needs on another site.  Here's what I came up with:

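// Requires: using System.Text;
private static string charScrubber(string content)
{
     // Pre-size the builder; the result is never longer than the input.
     StringBuilder sbTemp = new StringBuilder(content.Length);
     foreach (char currentChar in content)
     {
          // XML 1.0 allows only tab, line feed and carriage return below 0x20,
          // so every other control character (0x00, 0x07, ...) is dropped.
          // DEL (127) is legal in XML but is dropped here as well.
          bool isAllowedWhitespace = currentChar == '\t' || currentChar == '\n' || currentChar == '\r';
          if (isAllowedWhitespace || (currentChar >= 0x20 && currentChar != 127))
          {
               sbTemp.Append(currentChar);
          }
     }
     return sbTemp.ToString();
}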
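
The scrubber above needs the whole text in memory as a string, which the first post flagged as a problem for large files. One possible alternative, sketched here under assumptions, is to wrap the reader so that illegal characters are dropped while the parser is pulling data. SanitizingTextReader is a hypothetical name, not a framework class; it relies only on the overridable Peek and Read members of System.IO.TextReader, and it assumes surrogate halves in the input are correctly paired.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical wrapper: passes characters through from an inner reader and
// silently skips those that XML 1.0 does not allow.
public sealed class SanitizingTextReader : TextReader
{
    private readonly TextReader inner;

    public SanitizingTextReader(TextReader inner)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        this.inner = inner;
    }

    // Tab, LF, CR and 0x20..0xFFFD are kept (surrogate halves included).
    private static bool IsLegal(char c)
    {
        return c == '\t' || c == '\n' || c == '\r'
            || (c >= 0x20 && c <= 0xFFFD);
    }

    public override int Peek()
    {
        // Consume illegal characters first so Peek and Read stay in step.
        int next;
        while ((next = inner.Peek()) != -1 && !IsLegal((char)next))
            inner.Read();
        return next;
    }

    public override int Read()
    {
        int next;
        while ((next = inner.Read()) != -1 && !IsLegal((char)next)) { }
        return next;
    }

    public override int Read(char[] buffer, int index, int count)
    {
        if (count == 0) return 0;
        while (true)
        {
            int read = inner.Read(buffer, index, count);
            if (read <= 0) return 0;              // end of input
            int kept = 0;
            for (int i = 0; i < read; i++)
            {
                char c = buffer[index + i];
                if (IsLegal(c)) buffer[index + kept++] = c;
            }
            // If the whole chunk was illegal, read again rather than
            // returning 0, which callers would treat as end of input.
            if (kept > 0) return kept;
        }
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing) inner.Dispose();
        base.Dispose(disposing);
    }
}
```

Since XmlTextReader has a constructor that takes a TextReader, something like new XmlTextReader(new SanitizingTextReader(new StreamReader("input.xml"))) should let reader.Read() run over the file without an intermediate copy ("input.xml" is a placeholder). The same wrapper also works with XmlReader.Create.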

