A toy compiler that can convert Python scripts 🐍 to pickle bytecode 🥒
No third-party modules are required.
Using pip:
$ pip install pickora
From source:
$ git clone https://github.com/splitline/Pickora.git
$ cd Pickora
$ python setup.py install
Compile from a string:
$ pickora -c 'from builtins import print; print("Hello, world!")' -o output.pkl
$ python -m pickle output.pkl # load the pickle bytecode
Hello, world!
None
Compile from a file:
$ echo 'from builtins import print; print("Hello, world!")' > hello.py
$ pickora hello.py # output compiled pickle bytecode to stdout directly
b'\x80\x04\x95(\x00\x00\x00\x00\x00\x00\x00\x8c\x08builtins\x8c\x05print\x93\x94\x94h\x01\x8c\rHello, world!\x85R.'
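To see why loading the file prints `None` after the greeting, here is a minimal sketch that feeds the bytes shown above to the standard `pickle` module and lists their opcodes with `pickletools.genops`. The variable names (`payload`, `result`, `names`) are only for illustration.

```python
import pickle
import pickletools

# Bytes copied from the pickora output above.
payload = (
    b'\x80\x04\x95(\x00\x00\x00\x00\x00\x00\x00\x8c\x08builtins\x8c\x05print'
    b'\x93\x94\x94h\x01\x8c\rHello, world!\x85R.'
)

# Loading runs the bytecode: it looks up builtins.print and calls it,
# so the greeting is printed as a side effect.
result = pickle.loads(payload)

# The value left on top of the stack is print's return value.
print(repr(result))

# List the opcode names without executing anything.
names = [opcode.name for opcode, arg, pos in pickletools.genops(payload)]
print(names)
```

The loaded object is whatever the last call returned, which is why `python -m pickle` shows `None` for a script that ends with `print(...)`.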
usage: pickora [-h] [-c CODE] [-p PROTOCOL] [-e] [-O] [-o OUTPUT] [-d] [-r]
               [-f {repr,raw,hex,base64,none}]
               [source]

A toy compiler that can convert Python scripts into pickle bytecode.

positional arguments:
  source                source code file

optional arguments:
  -h, --help            show this help message and exit
  -c CODE, --code CODE  source code string
  -p PROTOCOL, --protocol PROTOCOL
                        pickle protocol
  -e, --extended        enable extended syntax (trigger find_class)
  -O, --optimize        optimize pickle bytecode (with pickletools.optimize)
  -o OUTPUT, --output OUTPUT
                        output file
  -d, --disassemble     disassemble pickle bytecode
  -r, --run             run (load) pickle bytecode immediately
  -f {repr,raw,hex,base64,none}, --format {repr,raw,hex,base64,none}
                        output format, none means no output
Basic usage: `pickora samples/hello.py` or `pickora --code 'print("Hello, world!")' --extended`
Basic syntax (pickle opcodes):
`val = dict_['x'] = obj.attr = 'meow'`
`x += 1`
`(x := 1337)`
`a, b, c = 1, 2, 3`
`f(arg1, arg2)`
`from module import things` (directly using the `STACK_GLOBAL` opcode)
Macros: `STACK_GLOBAL`, `GLOBAL`, `INST`, `OBJ`, `NEWOBJ`, `NEWOBJ_EX`, `BUILD`
Extended syntax (enabled with the `-e` / `--extended` option):
Note: All extended syntax is implemented by importing other built-in modules, so enabling this option triggers `find_class` when the pickle bytecode is loaded.
`obj.attr` (using `builtins.getattr`, only when you need to "load" an attribute)
Operators (using the `operator` module):
`+`, `-`, `*`, `/`, etc.
`not`, `~`, `+val`, `-val`
`0 < 3 > 2 == 2 > 1` (using `builtins.all` for chained comparisons)
`list_[1:3]`, `dict_['key']` (using `builtins.slice` for slices)
Boolean operators (using `builtins.next`, `builtins.filter`, `operator.not_` and `operator.truth`):
`(a or b or c)` -> `next(filter(truth, (a, b, c)), c)`
`(a and b and c)` -> `next(filter(not_, (a, b, c)), c)`
`import module` (using `importlib.import_module`)
`lambda x,y=1: x+y` (using `types.CodeType` and `types.FunctionType`)
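The rewrites listed above can be tried in plain Python. The sketch below only uses the helpers named in the list; the exact call sequences Pickora emits are not shown in this document, so the particular `operator` functions chosen here (`add`, `invert`, `lt`, `getitem`, and so on) are assumptions.

```python
import operator
from operator import not_, truth

nums = [10, 20, 30, 40]
a, b, c = 0, "", "fallback"

# obj.attr -> builtins.getattr
length = getattr(nums, "__len__")()           # same as nums.__len__()

# Binary / unary operators -> operator module
added = operator.add(1, 2)                    # 1 + 2
inverted = operator.invert(5)                 # ~5

# Chained comparison -> builtins.all over the pairwise comparisons
chained = all((operator.lt(0, 3), operator.gt(3, 2),
               operator.eq(2, 2), operator.gt(2, 1)))

# Subscript with a slice -> builtins.slice
sliced = operator.getitem(nums, slice(1, 3))  # nums[1:3]

# Boolean operators -> next(filter(...), last_operand)
or_result = next(filter(truth, (a, b, c)), c)   # a or b or c
and_result = next(filter(not_, (a, b, c)), c)   # a and b and c

print(length, added, inverted, chained, sliced, or_result, and_result)
```

One difference from native `or` / `and`: the tuple `(a, b, c)` is built before `filter` runs, so every operand is evaluated and there is no short-circuiting.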
There are currently 4 macros available: `STACK_GLOBAL`, `GLOBAL`, `INST` and `BUILD`.
STACK_GLOBAL(modname: Any, name: Any)
Example:
function_name = input("> ") # > system
func = STACK_GLOBAL('os', function_name) # <built-in function system>
func("date") # Tue Jan 13 33:33:37 UTC 2077
Behaviour: push `modname`, push `name`, then emit the `STACK_GLOBAL` opcode.
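At the opcode level, `STACK_GLOBAL` takes the module and attribute names from the stack, which is why both arguments can be arbitrary expressions. A hand-assembled sketch using only the standard `pickle` module follows; the helper name `stack_global_bytecode` is hypothetical, and `os.getcwd` stands in for `os.system` as a harmless target.

```python
import os
import pickle


def stack_global_bytecode(modname: str, name: str) -> bytes:
    """Hand-assemble: push modname, push name, STACK_GLOBAL, STOP."""

    def push_str(text: str) -> bytes:
        raw = text.encode("utf-8")
        # SHORT_BINUNICODE carries a one-byte length, so names must be short.
        return pickle.SHORT_BINUNICODE + bytes([len(raw)]) + raw

    return (
        pickle.PROTO + b"\x04"     # protocol 4 header
        + push_str(modname)
        + push_str(name)
        + pickle.STACK_GLOBAL      # pops both strings, pushes the attribute
        + pickle.STOP
    )


func = pickle.loads(stack_global_bytecode("os", "getcwd"))
print(func is os.getcwd)
```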
GLOBAL(modname: str, name: str)
Example:
func = GLOBAL("os", "system") # <built-in function system>
func("date") # Tue Jan 13 33:33:37 UTC 2077
Behaviour:
Simply write this piece of bytecode: f"c{modname}\n{name}\n"
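A minimal sketch of that byte string being loaded by the standard `pickle` module. The helper name `global_bytecode` is hypothetical, a STOP opcode is appended so the fragment is a complete pickle, and `os.getcwd` is used instead of `os.system` to keep the example harmless.

```python
import os
import pickle


def global_bytecode(modname: str, name: str) -> bytes:
    # GLOBAL is the byte "c" followed by two newline-terminated names.
    return f"c{modname}\n{name}\n".encode() + pickle.STOP


func = pickle.loads(global_bytecode("os", "getcwd"))
print(func is os.getcwd)
```

Because the names are baked into the bytecode as text, both arguments have to be string literals known at compile time, unlike `STACK_GLOBAL`.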
INST(modname: str, name: str, args: tuple[Any])
Example:
command = input("cmd> ") # cmd> date
INST("os", "system", (command,)) # Tue Jan 13 33:33:37 UTC 2077
Behaviour: push a MARK, push `args` in order, then write this piece of bytecode: `f'i{modname}\n{name}\n'`
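The `INST` opcode collects everything pushed since the last MARK and calls the named global with it. Below is a hand-assembled sketch under that reading, using only the standard `pickle` module; `inst_bytecode` is a hypothetical helper, and `builtins.len` / `builtins.max` replace `os.system` as harmless targets.

```python
import pickle


def inst_bytecode(modname: str, name: str, args: tuple) -> bytes:
    # A protocol-0 dump has no header, so dropping its trailing STOP byte
    # leaves just the opcodes that push one argument onto the stack.
    pushed_args = b"".join(pickle.dumps(arg, protocol=0)[:-1] for arg in args)
    return (
        pickle.MARK
        + pushed_args
        + pickle.INST + f"{modname}\n{name}\n".encode()
        + pickle.STOP
    )


length = pickle.loads(inst_bytecode("builtins", "len", ("abc",)))
largest = pickle.loads(inst_bytecode("builtins", "max", (3, 7)))
print(length, largest)
```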
BUILD(inst: Any, state: Any, slotstate: Any)
`state` is for `inst.__setstate__(state)` and `slotstate` is for setting attributes.
Example:
from collections import _collections_abc
BUILD(_collections_abc, None, {'__all__': ['ChainMap', 'Counter', 'OrderedDict']})
Behaviour: push `inst`, push the tuple `(state, slotstate)`, then emit the `BUILD` opcode.
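As a sketch of how the unpickler treats this sequence, the snippet below hand-assembles it with the standard `pickle` module. The helper `build_bytecode` is hypothetical, and a fresh `types.SimpleNamespace()` is used as `inst` purely for illustration (it defines no `__setstate__`, so the `slotstate` dict is applied with `setattr`).

```python
import pickle


def build_bytecode(state, slotstate) -> bytes:
    # 1. Push inst: a fresh types.SimpleNamespace() via GLOBAL + REDUCE.
    push_inst = (
        pickle.GLOBAL + b"types\nSimpleNamespace\n"
        + pickle.EMPTY_TUPLE + pickle.REDUCE
    )
    # 2. Push the (state, slotstate) tuple: a protocol-0 dump minus its STOP.
    push_state = pickle.dumps((state, slotstate), protocol=0)[:-1]
    # 3. BUILD pops the tuple and applies it to inst, which stays on the stack.
    return push_inst + push_state + pickle.BUILD + pickle.STOP


obj = pickle.loads(build_bytecode(None, {"x": 1}))
print(obj.x)
```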
RTFM.
It's cool.
No, not at all, it's definitely useless.
Yep, it's cool garbage.
Does it support syntax like `if` / `while` / `for`?
No. All pickle can do is define a variable or call a function, so this kind of syntax can't exist.
But if you want to do things like:
ans = input("Yes/No: ")
if ans == 'Yes':
    print("Great!")
elif ans == 'No':
    exit()
It's still achievable! You can rewrite your code like this:
from functools import partial
condition = {'Yes': partial(print, 'Great!'), 'No': exit}
ans = input("Yes/No: ")
condition.get(ans, repr)()
ta-da!
For loops, you can try `map` / `starmap` / `reduce`, etc.
And yes, you are right, it's functional programming time!
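A few loop rewrites in that style, as a sketch. It sticks to `from ... import`, assignments and plain calls so that it should stay within the basic syntax listed earlier, but it has only been reasoned about as ordinary Python, not run through Pickora.

```python
from builtins import list, map, pow, print
from functools import reduce
from itertools import starmap
from operator import add, mul

# for x in [1, 2, 3]: print(x)
list(map(print, [1, 2, 3]))

# squares = [pow(x, 2) for x in [1, 2, 3]]
squares = list(map(pow, [1, 2, 3], [2, 2, 2]))

# products = [x * y for x, y in [(2, 3), (4, 5)]]
products = list(starmap(mul, [(2, 3), (4, 5)]))

# total = 0; for x in [1, 2, 3, 4]: total += x
total = reduce(add, [1, 2, 3, 4], 0)

print(squares, products, total)
```

Wrapping `map` in `list` matters: `map` is lazy, so nothing is printed or computed until something consumes it.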