Friday, March 16, 2007

Why Web Security is a Difficult Problem To Solve

There has been a discussion about how the filters commonly implemented to block XSS are likely to fail. The most common treatment of user input is to convert the characters that could define a tag, event handler, etc. into HTML entities. This lets the input be displayed correctly without being executed.

E.g.
< is converted to &lt;
> is converted to &gt;
& is converted to &amp;
" is converted to &quot;

As RSnake pointed out, it's a common mistake to leave the g flag off your regex, in which case only the first match gets replaced.
So instead of $var =~ s/"/&quot;/;
you need to use $var =~ s/"/&quot;/g;
and so on for the other characters.
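
The same mistake can be sketched outside Perl. The snippet below is only an illustration in Python, with a made-up sample string: passing count=1 to re.sub plays the role of the missing g flag, and the default call plays the role of s///g.

```python
import re

# Hypothetical sample input, not taken from any real application.
text = 'say "hi" to "bob"'

# Like s/"/&quot;/ without the g flag: only the first quote is replaced.
first_only = re.sub('"', "&quot;", text, count=1)

# Like s/"/&quot;/g: every quote is replaced.
every_match = re.sub('"', "&quot;", text)

print(first_only)
print(every_match)
```

With the first form, every quote after the first one reaches the page unescaped.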


But wait, does it really end here?
Given that web applications use a lot of other layers (I mean databases, OS shells [commands], XML files and what not), filtering for one type of vulnerability may introduce another. The order in which the filters are applied matters a lot.

To explain it more clearly, let's take an example of an imaginary web page that executes the following pseudocode.

$input = $_GET['input'];
sanitize($input);
echo "hi $input";
echo `app.exe $input`;

function sanitize($input)
{
    replace_all($input, "&|\"|;", ""); // avoid command injection
    replace_all($input, "<", "&lt;"); // no XSS
    replace_all($input, ">", "&gt;"); // no XSS
    return $input;
}

Perfect, isn't it? Well, absolutely not.

Look at the function 'sanitize'.
Line 1 strips every & (along with " and ;) in order to avoid command injection.
But lines 2 and 3 put an & and a ; right back in, as part of &lt; and &gt;.

Now when the input is passed as an argument to app.exe, the injected command is going to execute!

So input like xyz<ls -l
is going to become xyz&lt;ls -l, and ls -l is going to get executed, because a Unix-like shell treats both & and ; as command separators.
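
To make the ordering problem concrete, here is a rough Python rendering of the pseudocode's sanitize function. It is a sketch, not the original code: replace_all is assumed to mean "replace every match", and the return value is used directly.

```python
import re

def sanitize(value: str) -> str:
    # Line 1 of the pseudocode: strip &, " and ; to avoid command injection.
    value = re.sub(r'[&";]', "", value)
    # Lines 2 and 3: HTML-encode angle brackets. This adds & and ; back in.
    value = value.replace("<", "&lt;")
    value = value.replace(">", "&gt;")
    return value

sanitized = sanitize("xyz<ls -l")
# The string that would be handed to the shell in the pseudocode.
command = "app.exe " + sanitized

print(sanitized)
print(command)
```

The characters that line 1 removed are present again by the time the string reaches the command line.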


This is just an example where only 2 layers were involved (HTML output, OS shell). But I'm sure there are thousands of combinations like that. And in many cases, the command execution will not immediately follow echo "hi $input", so the interaction will be hard to spot every time. But it is still going to exist.

As a general rule, I would filter input for a particular layer just before it enters that layer.
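
One way this rule could look in practice is sketched below in Python. The helper names for_html and for_shell are made up for the illustration. html.escape and shlex.quote are standard library functions, and shlex.quote is meant for POSIX shells only, so this assumes a Unix-like target.

```python
import html
import shlex

def for_html(value: str) -> str:
    # Encode only at the point where the value enters the HTML layer.
    return html.escape(value)

def for_shell(value: str) -> str:
    # Quote only at the point where the value enters the shell layer.
    return shlex.quote(value)

raw = "xyz<ls -l"  # the raw input stays untouched until it is used

page = "hi " + for_html(raw)
command = "app.exe " + for_shell(raw)

print(page)
print(command)
```

Each layer gets its own escaping of the original value, so the HTML entities never reach the shell. Where possible, passing the arguments as a list to something like subprocess.run without a shell avoids the shell layer entirely.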

Anyone aware of other combinations?
Trackbacks: Forgetting Global Replace XSS Woes

5 comments:

Anonymous said...

nice article

the '&' gets repeated several times in your article text.

gives something like '& amp;amp;amp;amp; ' ......

that makes the perfect example about why web security is hard to implement, and a good illustration about how poorly some blog systems like blog.com try to sanitize user input.

Yoyersteiner said...

Hmm... yes, that example is vulnerable to command line injection, but not because of bad filtering, but because of no filtering. Just calling your sanitize() function does not modify the variable content.

(why can't I use <code> here?)

Kishor said...

Yoyersteiner, it's not PHP/Perl code. It's PSEUDO code.

And
replace_all($input, "&|\"|;", ""); // avoid command injection

is trying to filter command injection.

Anonymous said...

Why not use the escapeshellarg function?

chandra said...

You have some really great articles.
As a web developer, I know it's a pretty difficult and at the same time challenging task to develop foolproof web apps.