Posts: 567 | Thanked: 2,965 times | Joined on Oct 2009
#1
If you search with Google (using the search bar in the browser, the Google Widget on the desktop, or even going straight to the Google homepage and searching that way), Google returns invalid XHTML for any search where the results contain a "places" section and that section contains an entry with an & sign.

More specifically, if there is an & sign in the name of an entry under "places", it is not being converted into &amp; as required by the XHTML specification.

For reference, https://drive.google.com/file/d/0B9i...ew?usp=sharing is the exact HTML Google serves for one of the search queries I am using to test (including the broken & symbol). I can confirm that both MicroB and a modern Gecko-based browser choke on this file with the exact same parsing error. If you look at the raw XHTML source, you can clearly see the string "George's Fish & Chips" instead of "George's Fish &amp; Chips" as it should be.
A normal search in all the desktop browsers I have installed shows that the correct &amp; is present there.
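To illustrate why a strict XML parser rejects the page, here is a minimal sketch using Python's built-in expat bindings. It is not MicroB's code, and the two sample strings are cut-down stand-ins for the real Google markup; the helper name `parses_as_xml` is made up for this example.

```python
import xml.parsers.expat


def parses_as_xml(document: str) -> bool:
    """Return True if expat accepts the document as well-formed XML."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input.
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


# Hypothetical stand-ins for the markup described above.
BROKEN = "<p>George's Fish & Chips</p>"      # bare ampersand
FIXED = "<p>George's Fish &amp; Chips</p>"   # properly escaped

if __name__ == "__main__":
    print("broken parses:", parses_as_xml(BROKEN))
    print("fixed parses:", parses_as_xml(FIXED))
```

In XML an & must start an entity or character reference, so a bare & followed by a space is a well-formedness error and the parser stops there.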

Since searching Google on the go (e.g. to find local examples of a certain type of business when I am out and about) is one of the major reasons I love my Nokia N900, I would like to find a solution to this problem.

Installing another browser isn't an option, especially since this will affect anyone using an N900 with MicroB and not just me.

Options:
1. Get Google to fix the output they return so the & symbol is properly converted to &amp;. Likely to be difficult: finding a way to report it to Google, getting them to care about a browser and platform this old, and then getting them to actually be willing to fix the issue.
2. Add something to microb-engine (in CSSU or via some sort of add-on or other change) that detects the bogus Google pages and properly converts the & signs into &amp; before the page is rendered. This would probably be hard to do without deep knowledge of the Gecko rendering engine internals (and not just the Gecko internals, but the internals of the very old fork of Gecko we are using on the N900).
3. Modify libexpat (in microb-engine) so it treats an & followed by a space character as valid instead of returning a parsing error. This is the easiest way to go, but I don't know if it will cause any failures elsewhere (although I doubt it would).

I am currently running a 3-line patch (attached to this post) that implements option 3 and so far (based on limited testing) it solves the Google Search issue (the test xhtml file above loads fine) and no side-effects have been observed.
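For illustration only: the sketch below is not the attached patch (which changes libexpat's C code inside microb-engine). It shows the same leniency rule, an & followed by whitespace is treated as &amp;, done as a pre-processing step in Python before handing the markup to expat. The function names and the regular expression are assumptions made for this example.

```python
import re
import xml.parsers.expat

# Assumption: only an "&" directly followed by whitespace counts as "bare".
# Real entity references such as "&amp;" or "&lt;" are left untouched.
_BARE_AMP = re.compile(r"&(?=\s)")


def escape_bare_ampersands(markup: str) -> str:
    """Rewrite '& ' as '&amp; ' so a strict XML parser will accept it."""
    return _BARE_AMP.sub("&amp;", markup)


def parse_leniently(markup: str) -> None:
    """Parse markup with expat after escaping bare ampersands.

    Raises xml.parsers.expat.ExpatError if the markup is still malformed.
    """
    parser = xml.parsers.expat.ParserCreate()
    parser.Parse(escape_bare_ampersands(markup), True)


if __name__ == "__main__":
    sample = "<p>George's Fish & Chips</p>"
    print(escape_bare_ampersands(sample))
    parse_leniently(sample)  # no exception: the bare "&" was escaped first
```

A text-level rewrite like this would also touch an "& " inside CDATA sections or comments, which is one reason a change inside the parser itself may behave differently from this sketch.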

So what I am looking for is feedback on where to go from here:
Is there a way to get Google to fix its output?
Is my 3-line patch the best solution?
Is there another way to solve this?
Attached Files
File Type: txt microb.diff.txt (546 Bytes, 119 views)
 

The Following 5 Users Say Thank You to jonwil For This Useful Post:
Posts: 567 | Thanked: 2,965 times | Joined on Oct 2009
#2
It's been pointed out on IRC that (a) changing MicroB to work around a bug in Google is a bad idea, and (b) the fact that an unescaped & sign can appear in a Google results page at all is a potential security risk.

So the patch in this thread should be ignored (although people can run local copies with that patch if they want Google to work in the short term). Maxdamantus on IRC has submitted (or will submit) a security report to Google describing the details, and hopefully Google can sort it out.
 

The Following 6 Users Say Thank You to jonwil For This Useful Post:
Posts: 14 | Thanked: 69 times | Joined on Jan 2015 @ New Zealand
#3
Originally Posted by jonwil:
Maxdamantus on IRC has submitted (or will submit) a security report to Google describing the details and hopefully Google can sort it out.
The guy from the security team who responded to my report has reproduced it and filed a bug report, so it should be fixed at some point.
 

The Following 9 Users Say Thank You to Maxdamantus For This Useful Post:
Posts: 1,203 | Thanked: 3,027 times | Joined on Dec 2010
#4
I've been getting XML parser errors for quite a while now; I just figured something on the phone was out of date.

Hopefully this gets resolved ASAP on their end.
 
Posts: 567 | Thanked: 2,965 times | Joined on Oct 2009
#5
Good to know that Google is going to investigate it.
 
jellyroll's Avatar
Posts: 435 | Thanked: 684 times | Joined on Apr 2012 @ Netherlands 020
#6
Is this already fixed? I couldn't use Google for a while.
 
Posts: 567 | Thanked: 2,965 times | Joined on Oct 2009
#7
Still broken for me (unless I use the MicroB hack patch).
Given that it's the holidays and it's not exactly a mission-critical fix, let's give Google some time to fix it, OK?
 

The Following 2 Users Say Thank You to jonwil For This Useful Post:
Posts: 567 | Thanked: 2,965 times | Joined on Oct 2009
#8
Seems like Google doesn't consider this issue serious (since it doesn't appear to be a security issue). I will continue using the MicroB patch for now, and maybe we should consider putting that patch into CSSU. A slightly hackish patch to microb-engine that treats an & followed by a space as though it were really &amp; followed by a space seems better for users than weird errors in Google half the time.
 

The Following 3 Users Say Thank You to jonwil For This Useful Post:
Posts: 567 | Thanked: 2,965 times | Joined on Oct 2009
#9
Given that Google doesn't seem inclined to fix this issue anytime soon, I have committed my local patch to CSSU Git (although mostly I did it just to get it out of my tree so I wouldn't lose it when fiddling with NSS stuff).
 