Sunday, February 21, 2010

smali/baksmali v1.2 released

After lots of hard work over the last month or two, smali/baksmali 1.2 is out!

The major new functionality in this release is that baksmali now supports deodexing without the help of deodexerant! It also has a new "register info" feature, which shows the register types in the disassembly, plus numerous minor fixes, changes, enhancements and tweaks (and probably bugs).


Deodexing


In order to deodex files now, you need to have the boot class path files available for baksmali to use. By default, it looks for the 5 main framework jars in the current directory. You can of course specify additional directories to search in, add additional boot class path files, or change which boot class path files are used altogether.

The DeodexInstructions page has more info on how to deodex with this version. But for a quick primer, you just need to have the 5 main framework files in the current directory (core.jar, ext.jar, framework.jar, android.policy.jar and services.jar), and then specify the -x option for baksmali. For example:
baksmali -x Calculator.odex -o Calculator
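
Since a missing framework file is an easy mistake to make, here is a minimal pre-flight sketch in Python that checks for the five files named above before you run baksmali. It is not part of smali/baksmali; the names `BOOT_CLASSPATH_JARS`, `missing_boot_jars` and `check_directory` are hypothetical helpers made up for this illustration.

```python
import os

# The five framework files named in the post.
BOOT_CLASSPATH_JARS = (
    "core.jar",
    "ext.jar",
    "framework.jar",
    "android.policy.jar",
    "services.jar",
)


def missing_boot_jars(present_names):
    """Return the expected jars that are absent from present_names, in list order."""
    present = set(present_names)
    return [name for name in BOOT_CLASSPATH_JARS if name not in present]


def check_directory(directory="."):
    """Hypothetical helper: report which expected jars are missing from a directory."""
    return missing_boot_jars(os.listdir(directory))


if __name__ == "__main__":
    missing = check_directory()
    if missing:
        print("Missing boot class path files: " + ", ".join(missing))
    else:
        print("All five framework jars are present in the current directory.")
```

Run it from the directory that holds the .odex file. It only looks at file names, so it cannot tell whether the jars actually match the build the .odex file came from.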

Register Info


Another bit of new functionality that can be very helpful is the new "register info" output for baksmali, which can be turned on with the -r parameter. It will analyze the registers and print some register type info before and after each instruction. There are several levels of register info output available, depending on exactly what you want to see. The default is to print the register type for any register that is used by the instruction.

Note that this functionality also requires that baksmali load the boot class path files - so they must be available. Here is an example of what the default register info looks like:
#v0=(Integer);v2=(Integer);
new-array v2, v0, [C
#v2=(Reference,[C);

The register types that are printed just before the instruction are the incoming register types, while the register types that are printed after the instruction show any changes to the registers caused by the instruction.
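
If you want to post-process this output, the comment lines are easy to pick apart. The following Python sketch assumes the line format is exactly what the example above shows (a leading `#`, then `register=(type);` entries); the function name `parse_register_info` is made up for this illustration and is not part of baksmali.

```python
import re

# One "register=(type)" entry. The type may itself contain ',' or ';'
# (for example "Reference,Ljava/lang/String;"), so match up to the
# closing parenthesis instead of splitting the line on ';'.
_REGISTER_ENTRY = re.compile(r"([vp]\d+)=\(([^)]*)\)")


def parse_register_info(line):
    """Map register names to type strings for one register info comment line.

    Returns an empty dict for lines that are not register info comments.
    """
    text = line.strip()
    if not text.startswith("#"):
        return {}
    return dict(_REGISTER_ENTRY.findall(text))


if __name__ == "__main__":
    for sample in ("#v0=(Integer);v2=(Integer);",
                   "new-array v2, v0, [C",
                   "#v2=(Reference,[C);"):
        print(sample, "->", parse_register_info(sample))
```

Comparing the dict parsed from the line before an instruction with the one parsed from the line after it shows which registers the instruction changed.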

If you want to see all the register info, you can use -r ALL,FULLMERGE which looks something like this:
#v0=(Integer):merge{0x18:(Null),0x2c:(Integer)}
#v1=(Conflicted):merge{0x18:(Uninit),0x2c:(Integer)}
#v3=(Conflicted):merge{0x18:(Uninit),0x2c:(Char)}
#v2=(Reference,[C);p0=(Reference,Ljava/lang/String;);p1=(Reference,[B);p2=(Integer);p3=(Integer);p4=(Integer);
iget v2, p0, Ljava/lang/String;->count:I
#v0=(Integer);v1=(Conflicted);v2=(Integer);v3=(Conflicted);p0=(Reference,Ljava/lang/String;);p1=(Reference,[B);p2=(Integer);p3=(Integer);p4=(Integer);
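
The merge lines can be read as "merged type, followed by the type arriving from each incoming code address". As a rough sketch of that reading, assuming the format is exactly as in the example above, the Python below splits one merge line into its parts. The name `parse_merge_line` is hypothetical and not part of baksmali.

```python
import re

# "#<register>=(<merged type>):merge{<address>:(<type>),...}"
_MERGE_LINE = re.compile(r"^#([vp]\d+)=\(([^)]*)\):merge\{(.*)\}$")
# One "<hex address>:(<type>)" entry inside the braces.
_INCOMING = re.compile(r"(0x[0-9a-fA-F]+):\(([^)]*)\)")


def parse_merge_line(line):
    """Parse one FULLMERGE merge line.

    Returns (register, merged_type, {address: incoming_type}), or None when
    the line is not a merge line.
    """
    match = _MERGE_LINE.match(line.strip())
    if match is None:
        return None
    register, merged_type, body = match.groups()
    incoming = {int(address, 16): incoming_type
                for address, incoming_type in _INCOMING.findall(body)}
    return register, merged_type, incoming


if __name__ == "__main__":
    print(parse_merge_line("#v1=(Conflicted):merge{0x18:(Uninit),0x2c:(Integer)}"))
```

For the v1 line in the example, this yields the merged type Conflicted together with Uninit arriving from address 0x18 and Integer arriving from address 0x2c.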

Other changes


There are a few other miscellaneous changes as well. Make sure you take a look at the usage info for smali and baksmali. The short parameters for some of the options have changed. In particular some of the options that are mostly for debugging purposes were changed to an uppercase letter, and are now hidden by default. You can use -?? for both smali and baksmali to see the debug options.

baksmali also has a new -f parameter, which adds a comment with the code address before each instruction. This is useful when looking at the FULLMERGE register info, which shows the register info and code addresses for all "incoming" execution paths.

Things to come


With this release, I have added a robust code analyzer/verifier that can infer the register types and validate the instructions. I plan to use this to add verification functionality to smali, so that it can optionally verify the code after assembling it. This will let you know there's a problem with the assembled code without having to push the code to a device and have dalvik complain to you about the invalid code.

I also want to add some way to dump/serialize the results of loading the boot class path files for baksmali, so that it can load the information it needs from the dump file, instead of reading in all 5 boot class path files every time, which should help speed it up.

In the longer term, I would love to be able to debug code on a device at an assembly level. This is just something that is banging around in the back of my head for now.

7 comments:

  1. Social comments and analytics for this post...

    This post was mentioned on Twitter by cyanogen: http://jf.andblogs.net/2010/02/22/smali-baksmali-1-2-released/...

  2. Excellent work jf! Been waiting for this for a while now.
    Waiting now for you to reach the ultimate goal (disassembling the whole system including libs) ;)

  3. Thank you very much.

    Hey, where are you? Still missing the old G1 days with the RC29 update.

  4. Just a quick thanks! The Android scene wouldn't be the same without you or your tools. I've used them a lot and really love them. Debugging on assembly level would be a dream come true, but seriously this is already insanely good!

  5. Did I mention you're out of your mind? :-)

    Something to think about:

    There's a value in the dependency area of a .odex file that has the value of the DALVIK_VM_BUILD constant (look for it in the sources to see where it's placed). It's updated whenever something that affects the validity of .odex files changes, like the inline method table, field layout, vtable ordering, etc. You may want to pull that out and make it available.

    I mention this because the contents of gDvmInlineOpsTable are going to change in the next release, and there's really no other way to identify what was used. (The DALVIK_VM_BUILD is not part of the version in the header because changing the inline methods doesn't change the structure or interpretation of the DEX file itself, e.g. dexdump continues to work as before.)

    The fact that eclair simply extended donut is a trend that's about to break. (Sorry.) FWIW, donut version=14, eclair=17, future is not yet written.

  6. re: out of my mind. Yes, I think you might have mentioned that once or twice :)

    I wasn't aware of the DALVIK_VM_BUILD property. That will certainly help distinguish which inline methods should be used.

    At some point, I also want to read the dependencies from that area and automatically try to load them, instead of having to specify them manually via a command line parameter.

    I figured things would change at some point, and that I would have to deal with them when they did. I decided to go this route rather than the native binary mainly for ease of use. It was a pain to have to run the deodexerant binary on the phone.

    Hopefully the vtable indexes and field offsets won't change as well. But if they do, I'm sure I'll figure out how to handle that also.

    Thanks for the heads up and the info! Much appreciated

