Creating associative or hash arrays in bash using sed and strings without the use of arrays, looping and conditionals.

Hashes are certainly a very important part of any language. If you're not used to hashes, you may not see their potential at first. However, having used them in several languages now, I've found that hashes always ended up reducing my code significantly, especially where only a complex solution would otherwise do. Older versions of bash (before 4.0) and of ksh (before ksh93), though, don't come with such a construct. Reading up on how to do such a thing didn't turn up the kind of elegant solution I wanted for most scenarios. What I really wanted was something that had these features:

I needed to define my hashes easily, similar to the Perl construct %hash = ( "key" => "value", "key1" => "value2" );
I needed easy key / value retrieval without excessive code, and something with a minimal implementation that's easy to use in looping and conditional constructs.
Finally, some flexibility, so I have the freedom to define various kinds of hashes depending on my needs.
Avoid (NOT) using arrays (declare -a array1), conditionals (if, case) and loops (for, while, do)

What I did was use a combination of strings and the Unix sed utility to accomplish all of the above in the script below (hashv retrieves values from keys; hashk retrieves keys from values):

#!/bin/bash

mhash="Jan:01 Feb:02 Mar:03 Apr:04 May:05 Jun:06 Jul:07 Aug:08 Sep:09 Oct:10 Nov:11 Dec:12";

function hashv {
        hkey="";
        mh="";
        if [[ $2 != "" ]]; then hkey=$2; else echo ""; return 0; fi
        if [[ $1 != "" ]]; then mh=$1; else echo ""; return 0; fi

        echo $mh|sed -e "s/.*\([ \t]*\)\($hkey\):\([^ \t]*\?\)\([ \t]*\).*/\3/gi"
}

function hashk {
        hvalue="";
        mh="";
        if [[ $2 != "" ]]; then hvalue=$2; else echo ""; return 0; fi
        if [[ $1 != "" ]]; then mh=$1; else echo ""; return 0; fi

        echo $mh|grep "$hvalue"|sed -e "s/\([^ \t]*\)[:]\($hvalue\)/\|\1|\2\|/i" -e "s/.*\?[|]\(.*\)[|].*[|].*\?/\1/i";
}

echo "____________________________________";
hashv "$mhash" "Dec";
hashk "$mhash" "12";
echo "____________________________________";
hashv "$mhash" "";
hashk "" "12";
echo "____________________________________";
hashv "$mhash" "Oct";
hashk "$mhash" "10";
echo "____________________________________";
hashv "$mhash" "Jan";
hashk "$mhash" "01";
echo "____________________________________";

Saved to a file named hash.bash and run, the above gives the output:

# ./hash.bash
____________________________________
12
Dec
____________________________________


____________________________________
10
Oct
____________________________________
01
Jan
____________________________________

which is what I was looking for. It turns out that the above code is more flexible in its implementation than I had really hoped for. If spaces and the colon (:) aren't satisfactory as delimiters, I can easily change them to something else I know I won't be using. The string definition makes defining hashes easy, even more so than the Perl counterpart I was used to (though, yes, probably not as powerful). Which brings up another interesting point: I also have the ability to change the behaviour of this type of hash definition, something I don't get in other languages.
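To make the point about delimiters concrete, here is a minimal sketch of a lookup that takes the two delimiters as parameters. The function name hashv_delim and its argument order are my own illustration, not part of the script above, and it swaps sed for tr and awk so that the key is compared as a plain string rather than as a regular expression. It assumes each delimiter is a single character that never appears inside a key or value.

```shell
#!/bin/bash
# Hypothetical helper (not from the script above): value lookup with
# caller-chosen delimiters.
#   $1 = hash string, $2 = key
#   $3 = pair separator, one character (default: space)
#   $4 = key/value separator, one character (default: colon)
hashv_delim() {
    local mh="$1" hkey="$2" psep="${3:- }" kvsep="${4:-:}"
    # Put one pair per line, then let awk compare the key field exactly.
    # Appending "" forces a string comparison, so "01" and "1" stay distinct.
    printf '%s\n' "$mh" | tr "$psep" '\n' |
        awk -F"$kvsep" -v k="$hkey" '($1 "") == (k "") { print $2 }'
}

hashv_delim "Jan:01 Feb:02 Mar:03" "Feb"            # prints 02
hashv_delim "Jan=01;Feb=02;Mar=03" "Mar" ";" "="    # prints 03
```

A key that is not present simply produces no output, so the caller can test the result for emptiness.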

Hope you found this useful! Enjoy and don't hesitate to write below!

This entry was posted on Friday, February 6th, 2009 at 4:13 am and is filed under NIX Posts.

4 Responses to “Creating associative or hash arrays in bash using sed and strings without the use of arrays, looping and conditionals.”

BASH, HASH and AWK - The UNIX and Linux Forums on February 7th, 2010 at 6:52 am

[…] Might not be exactly what you’re looking for but… With regards to above, you could try this: Creating associative or hash arrays in bash using sed and strings without the use of arrays, looping… You’ll have to tweak that code for your tastes though to use different delimeters depending on […]
Arun Sangal on April 15th, 2011 at 2:44 pm

The above didn’t work for the following example:

[user@BS001 stage]$ cat gigahash.sh
#!/bin/bash

mhash="abcd:55 bcda:4422 cdab:321 dabc:2131 eabcd:11";

function hashv {
        hkey="";
        mh="";
        if [[ $2 != "" ]]; then hkey=$2; else echo ""; return 0; fi
        if [[ $1 != "" ]]; then mh=$1; else echo ""; return 0; fi

        echo $mh|sed -e "s/.*\([ \t]*\)\($hkey\):\([^ \t]*\?\)\([ \t]*\).*/\3/gi"
}

function hashk {
        hvalue="";
        mh="";
        if [[ $2 != "" ]]; then hvalue=$2; else echo ""; return 0; fi
        if [[ $1 != "" ]]; then mh=$1; else echo ""; return 0; fi

        echo $mh|grep "$hvalue"|sed -e "s/\([^ \t]*\)[:]\($hvalue\)/\|\1|\2\|/i" -e "s/.*\?[|]\(.*\)[|].*[|].*\?/\1/i";
}

echo "____________________________________";
hashv "$mhash" "eabcd";
hashk "$mhash" "11";
echo "____________________________________";
hashv "$mhash" "";
hashk "" "11";
echo "____________________________________";
hashv "$mhash" "bcda";
hashk "$mhash" "4422";
echo "____________________________________";
hashv "$mhash" "abcd";
hashk "$mhash" "2131";
echo "____________________________________";
hashv "$mhash" "cdab";
hashk "$mhash" "55";
echo "____________________________________";

[user@BS001 stage]$

Now when I run the above script, the "11" in the fourth block of output ("11" followed by "dabc") is wrong, i.e. hashv "$mhash" "abcd"; failed to print the correct value of 55. It printed 11, which is the value for the eabcd hash key. Looks like a little tweak to the above functions would correct it.

[user@BS001 stage]$ ./gigahash.sh
____________________________________
11
eabcd
____________________________________

____________________________________
4422
bcda
____________________________________
11
dabc
____________________________________
321
abcd
____________________________________
[user@BS001 stage]$
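
The wrong value appears to come from the pattern not being anchored: the leading .* in the sed expression is greedy and the optional-whitespace group in front of the key may match nothing, so the expression locks onto the last place where abcd: occurs, which is the tail of eabcd:11. One possible sed-only fix is sketched below. The function name hashv_anchored is hypothetical, and the sketch assumes pairs are separated by single spaces and that keys contain no characters that are special to sed.

```shell
#!/bin/bash
mhash="abcd:55 bcda:4422 cdab:321 dabc:2131 eabcd:11"

# Hypothetical variant of hashv (an illustration, not the posted code).
hashv_anchored() {
    local mh="$1" hkey="$2"
    # Pad the string with spaces so every key has a space before it and
    # every value has a space after it. " $hkey:" can then only match a
    # whole key, never the tail of a longer one.
    printf ' %s \n' "$mh" | sed -n "s/.* $hkey:\([^ ]*\) .*/\1/p"
}

hashv_anchored "$mhash" "abcd"    # prints 55
hashv_anchored "$mhash" "eabcd"   # prints 11
```

Because of -n and the p flag, a key that is not present prints nothing at all.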
Arun K Sangal on April 15th, 2011 at 3:24 pm

OK, I corrected the logic for the hash array, and the script below now runs smoothly. It shows "55" for the abcd index and is no longer confused by the "eabcd" index (which contains "abcd" in its string). See the comments in the script for the usage of the "^" and "$" characters in grep.

[user@BS001 stage]$ cat ./gigahash.sh
#!/bin/bash

mhash="abcd:55 bcda:4422 cdab:321 dabc:2131 eabcd:11 giga:aks";

function hashv {
        hkey="";
        mh="";
        if [[ $2 != "" ]]; then hkey=$2; else echo ""; return 0; fi
        if [[ $1 != "" ]]; then mh=$1; else echo ""; return 0; fi

        ## Note the placement of "^" character in grep.
        echo $mh | sed "s/ /\n/g"| grep "^${hkey}" | cut -d':' -f2
}

function hashk {
        hvalue="";
        mh="";
        if [[ $2 != "" ]]; then hvalue=$2; else echo ""; return 0; fi
        if [[ $1 != "" ]]; then mh=$1; else echo ""; return 0; fi

        ## Note the placement of "$" character in grep.
        echo $mh | sed "s/ /\n/g"| grep ${hvalue}$| cut -d':' -f1
}

echo "____________________________________";
hashv "$mhash" "eabcd";
hashk "$mhash" "11";
echo "____________________________________";
hashv "$mhash" "";
hashk "" "11";
echo "____________________________________";
hashv "$mhash" "bcda";
hashk "$mhash" "4422";
echo "____________________________________";
hashv "$mhash" "abcd";
hashk "$mhash" "2131";
echo "____________________________________";
hashv "$mhash" "cdab";
hashk "$mhash" "55";
echo "____________________________________";

[user@BS001 stage]$ ./gigahash.sh
____________________________________
11
eabcd
____________________________________

____________________________________
4422
bcda
____________________________________
55
dabc
____________________________________
321
abcd
____________________________________
[user@BS001 stage]$
Arun Sangal on April 15th, 2011 at 4:11 pm

The script's logic for the hashv and hashk functions in the following lines is wrong, in the sense that it will not return the right index/value if one index string is contained in another index string, or if one value string is contained in another value string.

The following line
echo $mh|sed -e "s/.*\([ \t]*\)\($hkey\):\([^ \t]*\?\)\([ \t]*\).*/\3/gi"

in hashv() should be changed to:
echo $mh | sed "s/ /\n/g"| grep "^${hkey}:" | cut -d':' -f2

AND

The following line
echo $mh|grep "$hvalue"|sed -e "s/\([^ \t]*\)[:]\($hvalue\)/\|\1|\2\|/i" -e "s/.*\?[|]\(.*\)[|].*[|].*\?/\1/i";

in hashk() should be changed to:
echo $mh | sed "s/ /\n/g"| grep ":${hvalue}$" | cut -d':' -f1

Then any set of index:value pairs, e.g. a:1, aa:11, ab:2, abab:23 or cd:2323, will work correctly.
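
For reference, here is a sketch of how the two functions might look with both anchored grep patterns applied. It is an assumption-laden rewrite rather than the exact code from the comments: it uses tr instead of sed "s/ /\n/g" to split the pairs (the \n replacement is a GNU sed feature), drops the explicit empty-argument checks, and assumes that keys and values contain no spaces, colons or regex metacharacters.

```shell
#!/bin/bash
mhash="abcd:55 bcda:4422 cdab:321 dabc:2131 eabcd:11 giga:aks"

# hashv: print the value stored under key $2 in hash string $1.
hashv() {
    local mh="$1" hkey="$2"
    # One pair per line; "^key:" matches the complete key only.
    printf '%s\n' "$mh" | tr ' ' '\n' | grep "^${hkey}:" | cut -d':' -f2
}

# hashk: print the key whose value is $2 in hash string $1.
hashk() {
    local mh="$1" hvalue="$2"
    # ":value$" matches the complete value only.
    printf '%s\n' "$mh" | tr ' ' '\n' | grep ":${hvalue}\$" | cut -d':' -f1
}

hashv "$mhash" "abcd"   # prints 55
hashk "$mhash" "11"     # prints eabcd
```

Unlike the original functions, an empty or unknown key prints nothing rather than an empty line.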


Thoughts and Scribbles | MicroDevSys.com