
Compare Whether Two Python Files Result In Same Byte Code (are Code Wise Identical)

We're doing some code cleanup. The cleanup only affects formatting (if it matters, let's even assume that line numbers don't change, though ideally I'd also like to ignore lin

Solution 1:

You can use Python's built-in compile function, which compiles source code from a string (in your case, the contents of a file) into a code object, and then compare the resulting code objects. The example below compiles two equivalent programs and one almost-equivalent program, compares them, and then, purely for demonstration (you would not do this in practice), executes two of the code objects:

import hashlib
import marshal


def compute_hash(code):
    # Serialize the code object and hash the bytes, so the value is
    # repeatable across program runs.
    code_bytes = marshal.dumps(code)
    code_hash = hashlib.sha1(code_bytes).hexdigest()
    return code_hash


# Two equivalent programs that differ only in formatting.
source1 = """x = 3
y = 4
z = x * y
print(z)
"""
source2 = "x=3;y=4;z=x*y;print(z)"

# Almost equivalent: the variable x is renamed to a.
source3 = "a=3;y=4;z=a*y;print(z)"

obj1 = compile(source=source1, filename='<string>', mode='exec', dont_inherit=True)
obj2 = compile(source=source2, filename='<string>', mode='exec', dont_inherit=True)
obj3 = compile(source=source3, filename='<string>', mode='exec', dont_inherit=True)

# Compare the code objects directly.
print(obj1 == obj2)
print(obj1 == obj3)

# Demo only: run two of the code objects.
exec(obj1)
exec(obj3)
print(compute_hash(obj1))

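Prints (on the Python version used here; the result of the first comparison and the hash value may differ on other versions):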

True
False
12
12
48632a1b64357e9d09d19e765d3dc6863ee67ab9

This saves you from having to copy .py files, create .pyc files, compare .pyc files, and so on.
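Applied to files rather than inline strings, the same idea might look like the sketch below. The helper names compile_file and same_code are made up for illustration, and the files are assumed to be UTF-8 encoded. Whether == overlooks line and column differences appears to depend on the Python version, so treat a False result for reformatted files as a prompt to look closer, not as proof of a real change.

```python
from pathlib import Path


def compile_file(path):
    """Compile the source in `path` to a code object without running it."""
    source = Path(path).read_text(encoding="utf-8")
    # Use one fixed filename so co_filename is the same for every input.
    return compile(source, "<string>", "exec", dont_inherit=True)


def same_code(path_a, path_b):
    """Return True if both files compile to equal code objects."""
    return compile_file(path_a) == compile_file(path_b)
```

A call such as same_code("before/module.py", "after/module.py") (hypothetical paths) would then report whether the two versions compile to equal code objects.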

Note:

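The compute_hash function is only needed if you want a repeatable hash, i.e. one that returns the same value for the same code object across successive program runs. Because marshal.dumps serializes the whole code object, including its file name and line-number information, the hash changes when either of those changes, and the marshal format itself is specific to the Python version.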
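If line numbers and file names should be ignored entirely, one option is to compare only the attributes of a code object that do not record source layout. The following is a minimal sketch and not part of the original answer: normalize, _const_key and fingerprint are hypothetical helpers, and it assumes Python 3.8 or later (for co_posonlyargcount). It recurses into nested code objects (function and class bodies) found in co_consts.

```python
import hashlib
import types


def _const_key(value):
    """Turn one constant into a comparable, layout-independent key."""
    if isinstance(value, types.CodeType):
        return normalize(value)  # nested function or class body
    if isinstance(value, tuple):
        return ("tuple", tuple(_const_key(v) for v in value))
    if isinstance(value, frozenset):
        # Sort so the key does not depend on set iteration order.
        return ("frozenset", tuple(sorted(repr(_const_key(v)) for v in value)))
    # Include the type name so that 1, 1.0 and True stay distinct.
    return (type(value).__name__, repr(value))


def normalize(code):
    """Keep the parts of a code object that do not record source layout."""
    return (
        code.co_name,
        code.co_argcount,
        code.co_posonlyargcount,
        code.co_kwonlyargcount,
        code.co_flags,
        code.co_code,
        tuple(_const_key(c) for c in code.co_consts),
        code.co_names,
        code.co_varnames,
        code.co_freevars,
        code.co_cellvars,
        # Present on Python 3.11+; holds bytecode offsets, not line numbers.
        getattr(code, "co_exceptiontable", b""),
    )


def fingerprint(code):
    """Hex digest intended to stay stable under formatting-only changes."""
    return hashlib.sha256(repr(normalize(code)).encode("utf-8")).hexdigest()
```

Two files can then be compared with normalize(code_a) == normalize(code_b), or by storing fingerprint(code) before and after the cleanup. The digest is only meaningful within one Python version, and some layout changes may still alter the emitted bytecode, so a mismatch is a reason to inspect the file, not proof of a behavioural change.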


Solution 2:

This might not be the answer you are looking for, but why not use a diff tool to check whether the files have changed? https://linuxhandbook.com/diff-command/

If the files have changed, use a merge tool such as Meld to inspect the changes: http://meldmerge.org/
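For a scripted version of the same textual check, Python's standard difflib module can produce a unified diff. This is only a sketch: text_diff is a hypothetical helper and the files are assumed to be UTF-8 encoded. Note that a textual diff also reports pure formatting changes, so it answers "did the text change?" and not "did the code change?".

```python
import difflib
from pathlib import Path


def text_diff(path_a, path_b):
    """Return the unified diff between two text files as a list of lines."""
    lines_a = Path(path_a).read_text(encoding="utf-8").splitlines(keepends=True)
    lines_b = Path(path_b).read_text(encoding="utf-8").splitlines(keepends=True)
    # An empty result means the two files have identical text.
    return list(
        difflib.unified_diff(
            lines_a, lines_b, fromfile=str(path_a), tofile=str(path_b)
        )
    )
```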

