I want the input string "add [7,8,9+5,'io open'] 7&4 67"
to be split like ['add', "[7,8,9+5,'io open']", '7&4', '67']
i.e. within the line, quoted strings must stay intact, quotes included, and mustn't be split at all; everywhere else, whitespace-based splitting is required, like so:
>>> import shlex
>>> shlex.split("add [7,8,9+5,\\'io\\ open\\'] 7&4 67")
['add', "[7,8,9+5,'io open']", '7&4', '67']
But the user shouldn't have to use the \\ if possible, at least not for quotes, and ideally not for in-string whitespace either.
What would a function break_down() that does the above look like? I attempted the below, but it doesn't deal with in-string whitespace:
>>> import shlex
>>> def break_down(ln):
...     ln = ln.replace("'", "\\'")
...     ln = ln.replace('"', '\\"')
...     # User will still have to escape in-string whitespace
...     # Note: can't use posix=False; it splits on in-string whitespace and has no escape sequences
...     return shlex.split(ln)
...
>>> break_down("add [7,8,9+5,'io\\ open'] 7&4 67")
['add', "[7,8,9+5,'io open']", '7&4', '67']
>>> break_down("add [7,8,9+5,'io open'] 7&4 67")
['add', "[7,8,9+5,'io", "open']", '7&4', '67']
Maybe there's a better function/method/technique to do this; I'm not very experienced with the entire standard library yet. Or maybe I'll just have to write a custom split()?
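For reference, a minimal sketch of what such a custom split could look like, assuming quotes are always balanced and no escape sequences are needed at all. The name split_outside_quotes() is hypothetical, and this is only one possible approach, not something the standard library provides:

```python
def split_outside_quotes(ln):
    """Split on whitespace, except inside single- or double-quoted runs.

    Assumptions (not from shlex): quotes are kept in the output tokens,
    backslashes are ordinary characters, and an unclosed quote simply
    runs to the end of the line.
    """
    tokens = []     # finished tokens
    current = []    # characters of the token being built
    quote = None    # the currently open quote character, or None
    for ch in ln:
        if quote is not None:
            # Inside a quoted run: keep everything, including whitespace.
            current.append(ch)
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
            current.append(ch)
        elif ch.isspace():
            # Whitespace outside quotes ends the current token, if any.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


print(split_outside_quotes("add [7,8,9+5,'io open'] 7&4 67"))
```

With this sketch the user would not need to escape anything, at the cost of losing shlex's escape handling entirely.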
EDIT 1 : Progress
>>> def break_down(ln):
...     ln = r"{}".format(ln)  # no-op: returns ln unchanged; raw-ness applies to literals, not to str objects
...     ln = ln.replace("'", r"\'")
...     ln = ln.replace('"', r'\"')
...     return shlex.split(ln)
...
So now the user only has to type a single \ to escape in-string spaces etc. (quotes are escaped automatically by the function), kind of like they would in a shell. Seems workable.
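A small check of that behaviour, as a sketch: the raw literal below is only a stand-in for what input() would return when the user types one backslash (the variable name typed is made up for illustration). The r"{}".format(ln) line is left out here because it returns its argument unchanged.

```python
import shlex


def break_down(ln):
    # Escape both quote characters so shlex treats them as literal text.
    ln = ln.replace("'", r"\'")
    ln = ln.replace('"', r'\"')
    return shlex.split(ln)


# Stand-in for user input: the string contains exactly one backslash.
typed = r"add [7,8,9+5,'io\ open'] 7&4 67"

# Formatting into a raw template does not change the string.
assert r"{}".format(typed) == typed

print(break_down(typed))
# ['add', "[7,8,9+5,'io open']", '7&4', '67']
```

The doubled \\ in the earlier examples is only needed because those strings were written as ordinary (non-raw) Python literals.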