Thursday, July 22, 2010

url encoding using sed

It's easy in Perl: 's/([^A-Za-z0-9])/sprintf("%%%02X", ord($1))/seg;'. How about with sed? I found this page, but personally I thought a page full of s/// operators was a bit clumsy.

Here is a version that uses the look-up table idea from here.



#!/usr/bin/sed -f

# Brad Forschinger, 2010-07-23

# Encode % first; the loop below emits % itself and must never match it.
s/%/%25/g
:loop
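# Characters still to encode; each one must also have an entry in the table below.
/[][ <TAB>!"#\$&'()*+,.\/:;<=>?@\\^_`{|}~-]/ {
# Append the look-up table (each character followed by its hex code) after a marker.
s/$/%%BNJF 20<TAB>09!21"22#23\$24\&26'27(28)29\*2A+2B,2C-2D\.2E\/2F:3A;3B<3C=3D>3E?3F@40\[5B\\5C\]5D\^5E_5F`60{7B|7C}7D~7E/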
# \1 = leftmost character that has a table entry, \3 = its hex code; the table is dropped again.
s/\(.\)\(.*\)%%BNJF.*\1\([0-9A-F][0-9A-F]\).*/%\3\2/g
b loop
}



Note: <TAB> stands for a literal tab character.
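
To watch a single pass of the look-up substitution in isolation, here is a minimal sketch with a two-entry table (space and ! only). The function name step and the marker %%TBL are made up for this illustration; the script itself uses %%BNJF and the full table.

```shell
# One pass: append a tiny look-up table, then replace the leftmost
# character that has an entry in it. %%TBL is a hypothetical marker.
step() {
  sed -e 's/$/%%TBL 20!21/' \
      -e 's/\(.\)\(.*\)%%TBL.*\1\([0-9A-F][0-9A-F]\).*/%\3\2/'
}

echo 'a b!c' | step          # a%20b!c
echo 'a b!c' | step | step   # a%20b%21c
```

Each pass encodes exactly one character, which is why the script loops. If nothing is left to encode, the second substitution does not match and the table stays appended to the line, so the script only enters the block when the address on the line above it matches.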


% echo 'nom] !"#\$&'\''()*+,.\/:;<>?@\\^_`{|}~-[nom' | ./urlencode.sed
nom%5D%20%09%21%22%23%5C%24%26%27%28%29%2A%2B%2C%2E%5C%2F%3A%3B%3C%3E%3F%40%5C%5E%5F%60%7B%7C%7D%7E%2D%5Bnom
%


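% can't be included in the look-up table because of the iterative approach: each pass inserts a new %, so the loop would keep re-encoding its own output and never finish. Hence the initial s/%/%25/g.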
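
A small sketch of why the order matters, using plain s/// commands instead of the loop. The function names percent_last and percent_first are hypothetical, chosen only for this example.

```shell
# Wrong order: the % produced by encoding the space gets encoded again.
percent_last() {
  sed -e 's/ /%20/g' -e 's/%/%25/g'
}

# Order used by the script: handle % before anything that emits one.
percent_first() {
  sed -e 's/%/%25/g' -e 's/ /%20/g'
}

echo '100% sure' | percent_last    # 100%25%2520sure
echo '100% sure' | percent_first   # 100%25%20sure
```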

Enjoy!
