
July 20, 2012

Listed below are links to weblogs that reference So Now We're to Blame it All on the Belkin Router?:

Comments

Prokofy Neva

Comment I put on Tateru's blog that ended up in moderation:

The back story, as I'm the one who pushed this for several years and finally got the Lindens' attention, and finally Monty went out and bought a Belkin to test, God bless him:

http://secondthoughts.typepad.com/second_thoughts/2012/07/so-now-were-to-blame-it-all-on-the-belkin-router.html

And the issue here is to remain patient and curious about software, and resist that geeky culture that always wants to blame the user and their crappy home routers and this or that brand which "everyone knows" is crappy. Because the real question is why this software, which has been in beta for 10 years, can't work with the regular consumer products available in the regular normal computer stores, yanno?

Loki flags a problem that any of us can experience even just buying a Belkin: that the ISP, like Verizon, then requires new settings, a personal phone call, jiggering, etc. etc. And that can take half a day or more to accomplish.

Wolf may have his finger on the heart of the problem, "Second Life has to change how it uses bandwidth. Some of the Project Shining elements (The cache operation: did they never notice?) will pay off. But I do sometimes wonder if the Lindens have ever understood the Internet. HTTP textures, for instance: how often were we told to try turning that off?"

The Lindens need to change their software and stop torturing consumers.

c3

beta or obsolete...
but get the money first.
never gonna end..
virtualized futures await the stupid.

BTW- sad MIPS day... 20 something reported going comic book violent at the batman premiere in colorado...

guns, video game media, comicbooks, and adulthood.

Carl Metropolitan

I gave up on the Linden Lab SL clients about six months ago, when both the released version and the beta version were crashing my machine every five minutes. I am sadly not exaggerating here.

I reluctantly moved to Firestorm, accepting the trade off of not being able to trust the developers for a client that actually ran. So far I've been happy with it.

Unfortunately, I can't recommend Firestorm for you, because you have an SL business to run and a number of enemies who would happily take advantage of access to Firestorm code to screw with you.

I used to think LL might have made a mistake in open sourcing their client. Now, I have to wonder if someone at the company realized that making a client that consistently worked for the majority of their users was either beyond their abilities or of no interest to them.

Amanda Dallin

@Carl

I think they open sourced the client to get free labor from all those developers who gladly provided SL a variety of viewers. Open Source also fit the ideology of many of the Lindens back then. LL's corporate culture seems to be changing and becoming more professional but they (and we) have to live with all those early screwed up decisions.

Gwyneth Llewelyn

Just to add my two cents... Prokofy told me about this story yesterday. I had read Inara Pey's article, which mentions the "strange behaviour" of some routers that Linden Lab admits will be incompatible with future versions of their viewer, and I was quite curious about what exactly causes that incompatibility. I still remain curious, even more so these days, because it's not so easy to crash a router by piping traffic (HTTP traffic, at that!) through it. But apparently that's exactly what happens!

Now, I'm apparently not geeky enough to have anything against Cisco: I do own a Linksys router (now a Cisco brand), a WAG200G which has a label saying "Manufactured 05/2006" and another saying it's a "Home Gateway", even though at the time I bought it, it used to be on the shelf for "small office/home office" appliances. I had previous good experiences with similar routers from Cisco/Linksys for the small office environment and never had any problem with them. The WAG200G is so old that Linksys doesn't offer more upgrades to its firmware (I think they stopped doing so in 2008 or 2009) but there is a small community of developers who continue to improve the firmware, which is open source, and, yes, as you guessed, it's just Linux, like the majority of routers these days.

So in theory the difference between assembling your own self-made router and a shelf-based one is not the operating system: it's just that you'll have a larger box :) And of course you'll be able to buy whatever parts you wish for it. On the other hand, it's understandable that companies like Cisco or Belkin will assemble a router using whatever parts fit best into a tiny enclosure, and they're supposed to know what they're doing: I'm hardly convinced that they are going to pick broken parts for it! Also, I believe that most "self-built" small routers will use OpenWRT for their software (https://openwrt.org/) which can be installed on pretty much every small router out there (yes, even Belkin and Linksys ones!). Larger hardware, i.e. a PC, can just run "normal" Linux, but I'd need an expert to tell me why "normal" Linux would avoid the kind of problems that "specialised, router-specific" Linux has.

Tateru's comments regarding the difficulty of implementing certain protocols on routers might have applied to the ancient days when router manufacturers would develop their own operating systems, but, these days, everybody uses Linux. Everybody. The reason? Because Linux is free and *does* (allegedly) implement everything just right, and you can fit it into little memory (I think my Linksys has just 16 MBytes).

So, hmm, there is something lacking in this kind of argumentation which I'm probably missing. Again, I would find it very surprising that a router manufacturer would grab a copy of Linux — normal, specialised, embedded, whatever — deliberately change its networking subsystem to introduce "errors" and "badly supported protocols", and sell the package as a "home router", so that they could sell more expensive routers for "office users", using a flawless copy of Linux. I mean, it's not theoretically impossible to assume that, but I just find it hard to believe.

Nevertheless, my ancient WAG200G from Linksys also has problems connecting to SL. They're not as serious as what Prokofy reports, but they nevertheless exist: teleports failing, the connection being reset when logging in to a busy region, sudden spikes in traffic even though "nothing" is happening, very slow performance from the in-world browsers, and similar small (but annoying) faults. I never attributed this to the router, but just to something in SL. What exactly, is beyond me. I *can* look at the router's logs by logging in to it, and have caught it several times while it "hangs" when using SL, but never found anything remotely suspicious. Whatever is happening doesn't register in the logs. Strange, but true.

On the other hand, as Prokofy so well described, the SecondLife.log is crammed full of all sorts of errors. It has always been that way. No other application I use logs as many errors, although Chrome comes a close second. And, yes, Chrome also crashes the router sometimes, when loading some Flash games which I'm fond of playing.

So, mmh, even though this goes against my better judgement, I can "believe" that the combination of certain software and the kind of traffic patterns it has can, indeed, crash a router (or reset it, or just make it drop connections, etc.). Oh, and it's not just a question of Wi-Fi: except for my old laptop, the rest of my home computers are all directly plugged into the router itself. So whatever is happening is not to blame on Wi-Fi, but on the combination of router and software. But I still find it very, very strange. There is not really much that can "go wrong" on a router to make it *crash*, especially if we're just talking about retrieving textures via HTTP. Which is 99% of what most routers do: retrieve web pages via HTTP. Lots of them. Especially on Javascript/AJAX-based, intensive websites like Facebook, Google+ and the like. This is what is supposed to be "normal behaviour": retrieving things via HTTP is what routers *do*. I find it incredibly strange that LL "suggests" to turn off HTTP retrieval and rely on UDP connections to avoid problems with the router: common sense would suggest exactly the reverse approach!

And it's a pity, really. The latest versions of the LL viewer have increased FPS performance on my outdated hardware threefold (like Prokofy, I use very low settings as well — no shadows, no fancy water, not even fancy WindLight sky, unless I really want to take a picture). I was relieved to think that I could continue to use my old computers for a few more years. Now apparently I will have to replace my router if I still wish to enjoy SL! That's... well, absurd :) The only good thing is that even a "small office" router will be cheaper than a new computer, even if it costs me US$200...

Prokofy Neva

Well, Gwyn, that's all very enlightening. I'd love to hear Belkin's side of the story. And some other Linux developer's side of the story outside the Lab.

Do tell me what kind of new router you buy.

elizabeth (16)

i use whatever modem/router my ISP gives me as part my package

that way if is a problem then when i call them up then they will tell me how to fix. they tell me that they optimise their end only for the modem/router they provide. if i use any other brand then they say sorry cant help. talk to manufacturer

at the moment i got a thomson speedtouch off them. it goes ok. is not a wireless one tho. just cable

Siana Gearz

None of the problem routers run Linux. Linux needs a reasonable amount of RAM and Flash memory, in particular about 4MB of Flash and 8 to 16MB of RAM.

When trying to make routers cheaper to manufacture, the manufacturer is inclined to save on those components and give the router a lot less memory. Typical for such low end configurations is, for example, the VxWorks RTOS. It's hard to say what the culprit is - i consider it unlikely to be VxWorks itself, but it could be the custom software written on top of it, or it could be just the memory contention from being crammed into too tight a fit of a device.

Saying that the routers don't have a problem is obviously wrong. If you need to reset or power-cycle a router once a week in normal use, something is very distinctly wrong with its software, and it's quite plausible that it will behave two and a half orders of magnitude worse just from being subjected to a higher load. Most likely, it simply leaks memory. In home use, this usually matters little because the connection is reset every 24h from the network end (at least it is in my country), allowing the router to free its memory. Newer routers can still be buggy but not require manual resetting, because they can restart themselves if they ever lock up.

A simple recipe to make an average router suffer temporary failure: get a bittorrent client. Disable the uTP protocol, if supported. Crank up the number of connections from the default (usually around 20). Try to download something. Many routers start failing at 50 connections or less, better ones fail at 200.

It's possible to make many routers run more stably by disabling stateful firewall and other features. Also not putting heavy use on WLAN can improve router stability - though i'm not sure why - memory? CPU time?

Also, handshake when establishing a connection is a heavy process, much more difficult than maintaining a connection. Both routers and operating systems have a limit on a number of "half-open" connections, i.e. connections which are in the state of being established. Windows XP will cut down connections if the number of half-open exceeds, i think, 10. The limits have since been increased.

For this reason, in Singularity i tweaked texture fetch, allowing it to keep up to 32 connections, but only within an (approximate) bandwidth envelope, and never to start more than a handful of connections simultaneously, staggering the process of establishing connections over time. This hasn't proven perfect, but for many users it works better than LL's strategy. Again, there is a bit of bad layering between the systems, which means the head doesn't really know what the tail is doing, so we have to guesstimate how not to go over the limit. Unfortunately we got extra issues evident in the recent release, i suspect from merging recent LL changes.
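[Editor's sketch] The staggering idea described above can be illustrated with a toy scheduler. This is not Singularity's code: the class and method names are hypothetical, the cap of 4 simultaneous handshakes is an assumed stand-in for "a handful", and the bandwidth envelope is left out entirely.

```python
from collections import deque


class StaggeredFetcher:
    """Toy scheduler: caps open connections and simultaneous handshakes."""

    def __init__(self, max_open=32, max_connecting=4):
        self.max_open = max_open              # total connections (32, as in the comment)
        self.max_connecting = max_connecting  # handshakes at once (assumed value)
        self.pending = deque()                # requests waiting for a slot
        self.connecting = []                  # handshakes in progress
        self.open = []                        # established connections

    def submit(self, request):
        self.pending.append(request)

    def tick(self):
        """Advance one time step: finish handshakes, then start new ones."""
        # Handshakes started on the previous tick are now established.
        self.open.extend(self.connecting)
        self.connecting = []
        # Start new handshakes only while both limits hold.
        while (self.pending
               and len(self.connecting) < self.max_connecting
               and len(self.open) + len(self.connecting) < self.max_open):
            self.connecting.append(self.pending.popleft())

    def finish(self, request):
        """Close an established connection, freeing a slot."""
        self.open.remove(request)


def simulate(n_requests=100, ticks=12):
    """Return (peak simultaneous handshakes, peak total connections)."""
    fetcher = StaggeredFetcher()
    for i in range(n_requests):
        fetcher.submit(i)
    peak_connecting = peak_total = 0
    for _ in range(ticks):
        fetcher.tick()
        peak_connecting = max(peak_connecting, len(fetcher.connecting))
        peak_total = max(peak_total, len(fetcher.open) + len(fetcher.connecting))
    return peak_connecting, peak_total
```

With the default limits, `simulate()` never has more than 4 handshakes in flight and never holds more than 32 connections, however many requests are queued. A real client would also have to guess at the router's own limits, which is the "guesstimate" problem the comment mentions.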

Also, a bug has been found where the SL viewer makes frequent DNS requests (aka name requests), which translate names of services or websites into IP addresses, for names that the viewer should already know and has a right to cache. Each request also requires the router to reserve memory for a reply, and in known cases this has led to router crashes.
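[Editor's sketch] A minimal illustration of caching name lookups on the client side, so that repeated requests for the same hostname never reach the router. The class name and the fixed 300-second lifetime are assumptions made for the example; a real resolver would honour the TTL carried in the DNS record, which `socket.gethostbyname` does not expose.

```python
import socket
import time


class CachingResolver:
    """Remembers hostname lookups for a fixed time instead of re-asking."""

    def __init__(self, ttl_seconds=300.0, lookup=socket.gethostbyname,
                 clock=time.monotonic):
        self.ttl = ttl_seconds   # assumed fixed lifetime for every entry
        self.lookup = lookup     # function that performs the real DNS query
        self.clock = clock       # injectable so the cache can be tested offline
        self.cache = {}          # hostname -> (address, expiry time)

    def resolve(self, hostname):
        now = self.clock()
        hit = self.cache.get(hostname)
        if hit is not None and hit[1] > now:
            return hit[0]                      # still fresh: no network traffic
        address = self.lookup(hostname)        # cache miss or expired entry
        self.cache[hostname] = (address, now + self.ttl)
        return address
```

Calling `resolve()` twice for the same name within the lifetime triggers only one real lookup; the second call is answered from memory.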

However, chances are, weak routers can be made to work in the long run. The whole process of establishing and tearing down connections for every single inventory bit or texture is unnecessary. It's a leftover from the ancient standard which was already declared obsolete last century, HTTP 1.0. Most of the Internet supports the newer (finalised 1999) HTTP 1.1, which allows the same connection to be retained to satisfy successive requests, and allows the client to place a new request even before the old one has finished. Monty Linden is currently working on upgrading the complete SL infrastructure to support the newer protocol. The result will be that a much smaller number of connections will be sufficient for a high degree of bandwidth utilisation, and the problematic process of establishing and tearing down connections is eliminated.
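[Editor's sketch] The difference between one connection per request and a retained HTTP 1.1 connection can be seen on a loopback server that counts accepted TCP connections. Everything here is illustrative: the handler, the `/texture/N` paths and the payload are made up, and nothing talks to Second Life. It assumes Python 3.7 or later for `ThreadingHTTPServer`.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # lets the server keep connections alive
    connections = 0                 # TCP connections accepted so far

    def setup(self):
        super().setup()
        type(self).connections += 1  # setup() runs once per accepted connection

    def do_GET(self):
        body = b"texture-bytes"      # placeholder payload
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):    # keep the demo quiet
        pass


def count_connections(n_requests, reuse):
    """Fetch n_requests items and return how many TCP connections were used."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        conn = None
        for i in range(n_requests):
            if conn is None or not reuse:
                if conn is not None:
                    conn.close()     # old style: tear down after every request
                conn = http.client.HTTPConnection(host, port)
            conn.request("GET", f"/texture/{i}")
            conn.getresponse().read()  # drain the body so the socket can be reused
        if conn is not None:
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections
```

Fetching ten items with `reuse=False` makes the server accept ten connections; with `reuse=True` it accepts one. Each accepted connection is a handshake and a connection-table entry that a home router sitting in the path would also have to track.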

Another major issue with HTTP fetch in Second Life is the service blatantly lying: when an error occurs, it often responds with junk and a success result code, and otherwise responds with invalid or wrong result codes. This is also something they might fix eventually, i hope. At least Monty said something along the lines of "i was just appalled as you are when i discovered this" about a particular bug we discovered independently, so it looks like he has his eye on these things.
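[Editor's sketch] When a server may answer with junk under a success code, the client cannot trust the status line alone. The function below is a hypothetical sanity check, not what any SL viewer actually does: it accepts a response only if the status is 2xx, the body is non-empty, any declared `Content-Length` matches, and the body starts with a caller-supplied prefix (for example a file format's magic bytes).

```python
def response_is_trustworthy(status, headers, body, expected_prefix=b""):
    """Return True only if a 'successful' response also looks like real data."""
    if not 200 <= status < 300:
        return False                      # an honest error code
    if not body:
        return False                      # success code but nothing delivered
    declared = headers.get("Content-Length")
    if declared is not None:
        # A missing header is tolerated; a wrong or malformed one is not.
        if not declared.isdigit() or int(declared) != len(body):
            return False
    return body.startswith(expected_prefix)
```

A client using such a check can treat a failed validation like a failed fetch and retry later, rather than caching junk as if it were a texture.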

Before, it didn't even make sense to say something, because no-one would listen, or have the expertise or inclination to listen. I'm not sure what to make of Oz Linden's CV, because it puts him as a network solutions professional, and yet all he seemingly did at LL was try to herd nekos, make horrible colour palettes, and write trivial scripts.

Gwyneth Llewelyn

But I'm a bit appalled that LL implemented HTTP 1.0. Ugh! I would have thought that to replace a mostly-UDP-based infrastructure, the first thing to do is to handle HTTP in the most persistent way possible. There are so many techniques for doing that, like long polling... but to the best of my knowledge, all these require the almost-15-years-old HTTP 1.1 protocol to be implemented. I hope Monty is seriously planning to upgrade their code to work with HTTP 1.1 well before the new 'developer version' becomes the 'stable' version :-P

Gwyneth Llewelyn

Gah! I lost almost everything from my other comment... hehe.

Oh well, it was very techy, and basically just saying that my Linksys router most definitely runs Linux, it has 16 MB of RAM (split into 14 MB for "memory", of which 1-1.5 MB are always free, and 2 MB of "disk", which is 100% full). Load average is between 0.1 and 0.2, briefly spiking to 0.4 when SL is launched, but then dropping back to 0.2 after a few minutes. There are few "extra" features turned on in the router, except for a basic firewall (no complex filtering turned on). The processes taking the most memory are those from UPnP, but I'm afraid to turn UPnP off: that usually gives some NAT problems with some specific applications, like video streaming and such.

So from the software side of things, the Linksys router is not catastrophically bad. Of course, it might have that problem with the VxWorks chip — which, no matter how good the software is, there might be no way to go around it. I actually mentioned this issue to some hardware geeks, running a shop for 25 years, and, pretty much like me, they were fond of recommending Linksys routers for "customers who don't want to have router problems, ever". They were as much surprised as I was to learn that there is now a replicable way of making a Linksys router drop an ADSL connection just by using software — SL in this case — which opens multiple HTTP connections. Surprised... and a bit incredulous.

But, alas, the evidence seems to be clear...
