Charles F. 'Chuck' Stevens (born 1934) is an American neurobiologist at the Salk Institute in La Jolla. He is currently the Vincent J. Coates Professor at the Salk Institute and adjunct professor of pharmacology and neuroscience at 's School of Medicine. He is also an external professor at the and a general member of the.

Major contributions

He made several seminal discoveries regarding the molecular basis of synaptic transmission. In 2002, together with, Stevens described the '3/5 Power Scaling law of neural circuits.' Stevens and Anderson used noise analysis to infer the conductance of single acetylcholine ion channels. This work paved the way for Nobel laureate Neher's patch clamping techniques. Neher was a postdoctoral associate with Stevens at the and then.

Education

Stevens has a B.A. in psychology from, where he began his education hoping to be a physician. He then received an M.D. degree at, and a Ph.D. in biophysics from with. He was a member of the faculties at the Medical School and at before joining the Salk Institute.

Stevens was elected a member of the National Academy of Sciences in 1982, and he was formerly an investigator of the.
He was elected a Fellow of the American Academy of Arts and Sciences in 1984. In 2000 he was awarded the from the.
I produced for my pdfid and pdf-parser tools, you can find them on. There are translations of this page, see.

pdf-parser.py

This tool will parse a PDF document to identify the fundamental elements used in the analyzed file. It will not render a PDF document. The code of the parser is quick-and-dirty; I’m not recommending this as a textbook case for PDF parsers, but it gets the job done. You can see the parser in action in.

The stats option displays statistics of the objects found in the PDF document. Use this to identify PDF documents with unusual/unexpected objects, or to classify PDF documents. For example, I generated statistics for 2 malicious PDF files, and although they were very different in content and size, the statistics were identical, proving that they used the same attack vector and shared the same origin.

The search option searches for a string in indirect objects (not inside the stream of indirect objects). The search is not case-sensitive, and is susceptible to obfuscation techniques (as I’ve yet to encounter these obfuscation techniques in the wild, I decided not to resort to canonicalization).

The filter option applies the filter(s) to the stream. For the moment, only FlateDecode is supported (i.e. zlib decompression).

The raw option makes pdf-parser output raw data (i.e. not the printable Python representation).

The objects option outputs the data of the indirect object whose ID was specified. This ID is not version dependent. If more than one object has the same ID (disregarding the version), all these objects will be output.

The reference option allows you to select all objects referencing the specified indirect object. This ID is not version dependent.

The type option allows you to select all objects of a given type.
The type is a Name and as such is case-sensitive and must start with a slash-character (/).

MD5: 7EB1713631D255B36BC698CD2422C7EB
SHA256: D4D5AC9C26A9D8FEF65CE58A769D3F64A737860DC26606068CCDD3F04FDEA0D7

make-pdf tools

make-pdf-javascript.py allows one to create a simple PDF document with embedded JavaScript that will execute upon opening of the PDF document.

I’d like to be able to view a scanned PDF file (with handwriting in some fields) and black out boxes on the form whose fields contain info I don’t want published. Can that kind of thing be automated in a batch so that I don’t even have to open the files? That would be cool. Can you point me in the right direction?
I’m not looking for you to code, but sending me in the right direction for this would be useful, and it looks like you’re cognizant of this kind of information.

Comment by james — Friday 21 November 2008

Hello — I am using pdf-parser and Python for the first time so please excuse my ignorance. I’m using Python 3.0.1 on Windows XP. I’ve copied the pdf-parser.py file into the C:\Python30 directory which contains the python executable. Below is the error I get when attempting to execute your utility:

    C:\Python30\python.exe pdf-parser.py
      File "pdf-parser.py", line 180
        print 'todo 1:%s' % (self.token1 + self.token21)
        ^
    SyntaxError: invalid syntax

Any ideas? Thanks for your time.

Comment by Mike — Tuesday 5 May 2009

Hello — I’m now using Python 2.6.2. It appears to be working, however I am getting so much output from every PDF I examine, I wonder if I am doing something wrong. My syntax is

    pdf-parser.py --search javascript malware.pdf

The utility spits out hundreds, maybe thousands, of lines of returned information. At the very top there appears to be useful data, however there are hundreds of lines that look like:

    todo 10: 3 ‘X1x1ex1bx03x12x05X60B

Am I doing something incorrectly here or is there a way to filter the rest of this data out? Thanks for your help.

PS, I watched your video on pdf-parser and it doesn’t have any audio.

Comment by Mike — Thursday 7 May 2009
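The SyntaxError quoted above is what a Python 3 interpreter reports when it meets Python 2 syntax, which fits Mike's follow-up that the script ran once he switched to Python 2.6.2. As a rough sketch (the function names below are made up for illustration and are not taken from pdf-parser.py), these are the Python 3 spellings of two Python 2-only constructs: the print statement, and the comma form of an except clause.

```python
import zlib


def format_todo(token1, token2):
    # Python 2 statement form:  print 'todo 1: %s' % (token1 + token2)
    # Python 3 needs the function form, with the argument in parentheses.
    message = 'todo 1: %s' % (token1 + token2)
    print(message)
    return message


def inflate_or_report(data):
    # Python 2 also accepted "except zlib.error, e:"; Python 3 requires "as".
    try:
        return zlib.decompress(data)
    except zlib.error as e:
        return 'zlib error: %s' % e
```

Running a Python 2 script under Python 2, as Mike ended up doing, avoids both errors without touching the source.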
I’m getting errors when running the Python script:

    C:\Documents and Settings\yo\Desktop\Tools\pdf\pdf-parser.py
      File "C:\Documents and Settings\yo\Desktop\Tools\pdf\pdf-parser.py", line 198
        print 'todo 1:%s' % (self.token1 + self.token21)
        ^
    SyntaxError: invalid syntax

I’m getting errors when trying to run the script. I’m using ActivePython 3.1 on Windows XP, launching it from the command line. Were there any recent modifications which broke the script? Thanks

Comment by Dave — Friday 17 July 2009

Hey Didier, thanks for an excellent tool and a great PDF analysis blog. I enjoyed every minute and in addition I have become much more paranoid when it comes to carelessly downloading tons of PDF material. Now I run all my PDFs through your “pdfid” tool if I have downloaded anything from a suspicious site. But I can’t help thinking that this should be implemented as an automatic plug-in/add-on to Firefox? You know, when you click on PDFs, they usually automatically open in the browser, which would be nice if it were safe. But in the cyber-war era of today it is simply very bad, at its best!

Comment by E:V:A — Saturday 16 January 2010

Is the file size limited? Every time I scan larger PDF files I get exceptions like this:

    Error occured.
    Traceback (most recent call last):
      File "C:\PDFtools\pdfid.py", line 363, in PDFiD
        (bytesHeader, pdfHeader) = FindPDFHeaderRelaxed(oBinaryFile)
      File "C:\PDFtools\pdfid.py", line 218, in FindPDFHeaderRelaxed
        bytes = oBinaryFile.bytes(1024)
      File "C:\PDFtools\pdfid.py", line 70, in bytes
        inbytes = self.infile.read(size - len(self.ungetted))
    IOError: [Errno 9] Bad file descriptor

Comment by sheldor — Monday 8 February 2010
With PDFiD, I’ve noticed I get a lot of false positives on the /JS and /AA tags, since in most cases (that I’ve looked at) they seem to be simply text in a compressed image or something similar. I haven’t seen a /JS used on its own for JavaScript, but it does seem that if there is a /JS then there is also a /S/JavaScript to go with it. Is this always the case, or just in the samples I’ve looked at so far (same applies for AA)? Finding the text JavaScript is much less likely to lead to a false positive than JS.

Comment by Russell — Monday 28 June 2010

Possible bug: PDFiD sometimes fails in cPDFEOF when using the --extra option for entropy, stating cntCharsAfterLastEOF doesn’t exist. Defining it in __init__ seems to fix the issue.

Other notes: Is it possible to use pdf-parser to parse pdf-parser output? For example, I can see a use for this when using pdf-parser to obtain the contents of object streams, but then it would be nice if it were possible to use pdf-parser on THAT output to display all Launch commands, for example (similar to piping into PDFiD, but actually seeing the contents instead of just the count). Then again, object stream structure is a bit different so perhaps that’s why it doesn’t play nice. I haven’t figured it out yet.

Comment by Russell — Thursday 15 July 2010

Didier Stevens’ PDF tools

Over the weekend, I was reading Didier Stevens’ chapter on malicious PDF analysis and I have to give credit to him for breaking down the technical part of a PDF into something simple and easy to understand (er, maybe I am the only one who is coming to terms with PDF for the first time). Reading the article brought me to his PDF-tools.
Pdfid and pdf-parser are definitely a must-try if you really want to get hands-on with PDF analysis.

Pingback by — Sunday 3 October 2010

A couple of observations about pdf-parser.py.

First, it is very slow on files which have large images embedded in them. I think this comes from the tokenizer code, which contains lines such as

    self.token = self.token + chr(self.byte)

There is a good analysis of the speed of this compared to other methods at. When I changed it so that self.token is a StringIO buffer, I got a huge increase in speed. In particular, one file which had not completed parsing after 30 minutes was now processed in a few seconds.

Secondly, I noticed that Decompress was not called on some stream data. This turned out to be because the stream was ASCII85 encoded and ended like this:

    T.5.QV#Ts4Iendstream

Note that there is no end-of-line character between the ASCII85 end marker and the endstream keyword. According to the PDF 1.7 specification, this is not approved of but is allowed: “It is recommended that there be an end-of-line marker after the data and before endstream; this marker is not included in the stream length.” (on page 61 of.) The file in question was generated by reportlab.

My first thought for fixing this was to change

    if self.content[i][0] == CHAR_REGULAR and self.content[i][1] == 'endstream':

to

    if self.content[i][0] == CHAR_REGULAR and self.content[i][1].endswith('endstream')

and then trim the keyword off the data. However, this does not work, as self.content[i][1] actually ends with a newline character, and self.content[i][0] has the value CHAR_DELIMITER. Something like

    if self.content[i][1].strip().endswith('endstream'):
        end = self.content[i][1].rindex('endstream')
        data += self.content[i][1][:end]

might do the job, though it’s ugly. The ideal solution would really be to use the length attribute from the dictionary, though this seems to be a bigger change.

Otherwise, the code looks great, and is really helping me with a project I am working on.

Comment by David Elworthy — Tuesday 1 January 2013
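The two ideas in the comment above can be sketched in isolation. This is a standalone illustration in Python 3 using io.BytesIO, not the commenter's actual patch (which used a Python 2 StringIO inside the tokenizer); the names collect_token and trim_endstream and the sample bytes are hypothetical. The first function accumulates bytes in a buffer rather than re-concatenating a string for every byte, and the second trims a trailing endstream keyword that is glued to the data.

```python
import io

KEYWORD = b'endstream'


def collect_token(byte_values):
    """Accumulate bytes in a buffer instead of repeated concatenation."""
    buffer = io.BytesIO()
    for value in byte_values:
        buffer.write(bytes([value]))
    return buffer.getvalue()


def trim_endstream(chunk):
    """Return the data before a trailing 'endstream', or None if absent."""
    # Ignore the end-of-line characters that may follow the keyword.
    stripped = chunk.rstrip(b'\r\n')
    if not stripped.endswith(KEYWORD):
        return None
    return stripped[:-len(KEYWORD)]
```

For example, trim_endstream(b'abc~>endstream\n') returns b'abc~>'. Reading exactly the number of bytes given by the stream dictionary's length entry, as the comment suggests, would be more robust than looking for the keyword, but it is not shown here.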
My end goal is writing a scanner application which will build archive versions of documents from photographs of pages. I’m a long way off this, and so was using a PDF built from some photos of landscapes, but even so the files were only a few megabytes. Eventually I want to generate my own PDFs, as I don’t much like reportlab and pyPDF, but for now reportlab is what I am using. I was looking at your code as a way of understanding the file format. As a shorter term project, I also want to write something which will take files with 600 dpi images from a flatbed scanner and either downsample them to a lower dpi or increase the JPEG compression, as I sometimes find the 600 dpi scans (which are meant to be archive quality) are a bit large for emailing when there are a lot of pages. Of course there are plenty of applications which allow you to manipulate PDFs interactively, but I’m a command line kind of guy, so a Python script would be ideal.

Comment by David Elworthy — Tuesday 1 January 2013

@Phil OK, I was sure you used option -a. You have to know that PDF readers like Adobe Reader do not allow you to extract executable files. To determine if a file is executable or not, Adobe Reader looks at the extension. So you can’t extract .exe files (unless you change the extension to something that is not executable, like .txt). Option -a instructs my tool to add JavaScript to the PDF document to extract the embedded file automatically. But since this is not allowed for an .exe file, the script fails, and that is what you see in the error messages.

FYI, because you are doing this on OSX: Python (.py) is allowed as an executable file type.

Comment by — Wednesday 13 March 2013
Suspected that much. Was able to unpackage Acrobat to determine the list of disallowed extensions.

@Didier: Great! Sure enough there was some missing stuff in there, but there were 6 counts of “%%EOF” and I can only tell any obvious difference between the 1st and 2nd versions. The later ones “look” the same. I wish there were some kind of more visual PDF diffing utility.

BTW, I got the offsets by:

    strings -n 4 -t x -e s weird.pdf | grep -i -E "%%EOF"

Then extracted the versions with:

    dd if=weird.pdf of=weirda.pdf bs= count=1

Thanks again.

PS. Sorry, I can’t attend your training as I live very far away.

Comment by CurlyBird — Thursday 5 September 2013

Hello Didier. I don’t know anything about Python or PDF security. Today I downloaded a PDF file. When I opened it, “cmd” appeared and many lines displayed quickly inside. So I have to check this strange PDF file. I installed Python 3.3.2 (my PC’s OS is Windows 7) but I have a problem when I test:

    C:\Python33\python.exe pdf-parser.py
      File "pdf-parser.py", line 486
        except zlib.error, e:
        ^
    SyntaxError: invalid syntax

Thanks for your help. It’s very important and urgent for me to check this file and to know if my PC has a problem.

Mathias

Comment by Mathias Rollet — Saturday 7 September 2013

Hi Didier, I find PDFiD.py interesting but I am having a difficult time trying to get it to work.
I have a VM with Windows XP installed, and Python 2.6.6 is installed and appears to be working fine. If I enter PDFiD MyFile.pdf I get a syntax error. Although your example shows PDFiD 0.0.2 test.pdf, I just don’t understand what significance the number has and what the correct syntax is to get my example to work. Another example I’m having difficulties with is pdf-parser.py, which should work by entering pdf-parser.py MyFile.pdf --search=javascript. Any help would be greatly appreciated.

Thanks,
Michael

Comment by Michael — Sunday 26 January 2014

Didier, thanks for getting back to me so soon! Below are the syntax errors.

Example 1: PDFiD

    Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> PDFiD TheFlyv3EN4Rdr.pdf
      File "<stdin>", line 1
        PDFiD TheFlyv3EN4Rdr.pdf
        ^
    SyntaxError: invalid syntax

Example 2: pdf-parser.py

    >>> pdf-parser.py TheFlyv3EN4Rdr.pdf --search=javascript
      File "<stdin>", line 1
        pdf-parser.py TheFlyv3EN4Rdr.pdf --search=javascript
        ^
    SyntaxError: invalid syntax

Thanks,
Michael

Comment by Michael — Sunday 26 January 2014
Didier, I need some help. I have recently upgraded to Adobe XI and some of my previously OK Adobe PDF files now cannot be read. Adobe indicate that they have increased “security” and enforced some compliance in their document headers. I have gone back through my useful bits of software that may be able to help me and come across your 010 editor. I have compared a couple of files that are OK and still working but I am not able to spot any significant differences (other than length, width, height etc.). Have you come across this problem before? Any ideas as to how to resolve the problem? BTW the file is quite large (25MB) and my programming skills are now quite poor – used to be an assembler / C programmer about 20 years ago!!!

Comment by Paul Kirikal — Sunday 24 August 2014

Steven, I’m trying to use pdfid on Windows 8 with Python 2.7.8. When I ran it from cmd.exe I got this error:

    C:\Users\Test\Desktop\pdfid.py MultiplePages.pdf
    Traceback (most recent call last):
      File "C:\Users\Test\Desktop\pdfid.py", line 25, in <module>
        import urllib.request
      File "C:\Python27\lib\urllib.py", line 33, in <module>
        from urlparse import urljoin as basejoin
      File "C:\Python27\lib\urlparse.py", line 119, in <module>
        from collections import namedtuple
      File "C:\Python27\lib\collections.py", line 12, in <module>
        import heapq as _heapq
    ImportError: No module named heapq

Could you help me please?

Comment by Suleiman Khitan — Friday 3 October 2014

Regarding the 232 question, I have this error:

    C:\Users\Test\Desktop\pdfid.py MultiplePages.pdf
    Traceback (most recent call last):
      File "C:\Users\Test\Desktop\pdfid.py", line 20, in <module>
        import zipfile
      File "C:\Python27\lib\zipfile.py", line 4, in <module>
        import struct, os, time, sys, shutil
      File "C:\Python27\lib\shutil.py", line 12, in <module>
        import collections
      File "C:\Python27\lib\collections.py", line 12, in <module>
        import heapq as _heapq
    ImportError: No module named heapq

Comment by Suleiman Khitan — Friday 3 October 2014
Got an error:

    C:\Users\root\Downloads>python pdf-parser.py -w pagrindinisbrezinys.pdf
    PDF Comment %PDF-1.5
    Traceback (most recent call last):
      File "pdf-parser.py", line 1201, in <module>
        Main()
      File "pdf-parser.py", line 1094, in Main
        print('PDF Comment %s' % FormatOutput(object.comment, options.raw))
      File "C:\Python33\lib\encodings\cp775.py", line 19, in encode
        return codecs.charmap_encode(input,self.errors,encoding_map)[0]
    UnicodeEncodeError: 'charmap' codec can't encode characters in position 13-15: character maps to <undefined>

Comment by Donatas — Thursday 19 March 2015

Hi Didier, I recently received a PDF document that I have attempted to analyze using your tools. When opened, it was obvious that the document has a link to a credential harvesting site that it tempts users to click on. However, using pdf-parser, I am unable to locate the URI object. I attempted to decompress the 4 object streams but received errors relating to an unexpected compression method. I then attempted to follow the method you posted regarding the handling of special PDF compression methods, but also to no avail. Is this a new technique or is there something I have missed? Of note, there appears to be some form of DRM/encryption also applied, as there are also 2 /Encrypt objects. I have uploaded the file to VT (SHA 256: 7d2b615630efd2fa3713d97e57afb9972f43e7d4a67cc706af7c789dd1dbe47f) if you are interested in taking a look.

Tom

Comment by Tom — Thursday 5 January 2017

Hi Didier, I’m wondering if you’ve released any new versions of pdfid.py and pdf-parser.py? Kali folks just released, for free, “Kali Linux Revealed” and I wanted to take a look; however, pdfid.py hangs while trying to analyze this file.

Hey Didier, great work, BTW.
I had a suggestion for what I think would be a useful feature for pdfid. In addition to the strings you’re currently counting, also count “/URI (http”. I think that all of the malicious PDF files I’ve seen for the last couple of years have just been vehicles to get malicious links past email filtering. It would be useful as well to actually parse out the links, as pdf-parser does, but that’s probably beyond your intended scope for this tool. Another possible way to do this would be to count only ‘suspicious’ http URI values, such as those using bare IP addresses, shortened URLs, or other criteria.

Thanks,
John McCash

Comment by — Thursday 21 September 2017
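The suggestion above can be tried outside pdfid with a few lines of Python. This is only a sketch of the idea, not a pdfid feature: the function names, the regular expressions and the sample bytes are assumptions made for illustration. It scans raw file bytes, so it will not see URIs inside compressed streams or object streams, nor URIs written as hex strings or with escaped characters.

```python
import re

# "/URI (http..." with optional whitespace, capturing up to ")" or whitespace.
URI_PATTERN = re.compile(rb'/URI\s*\(\s*(https?://[^)\s]*)', re.IGNORECASE)

# One possible "suspicious" criterion from the comment: a bare IPv4 host.
BARE_IP = re.compile(rb'^https?://\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?(?:/|$)')


def find_http_uris(raw):
    """Return every http(s) URI that follows a /URI key in the raw bytes."""
    return URI_PATTERN.findall(raw)


def is_bare_ip_uri(uri):
    """True when the URI's host is a dotted-quad IP address."""
    return BARE_IP.match(uri) is not None


# Hypothetical fragment of an uncompressed PDF object.
sample = (b'<< /S /URI /URI (http://192.0.2.10/login) >>'
          b'<< /S /URI /URI (https://example.com/a) >>')
uris = find_http_uris(sample)
flagged = [u for u in uris if is_bare_ip_uri(u)]
```

Other criteria mentioned in the comment, such as shortened URLs, would need their own checks, for instance a list of known shortener domains.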
We are using your tool as a standalone tool, among many other tools, to analyze sample file(s) and produce output data from them. This is implemented by building a CI/CD pipeline which finally generates results combining the result data of the different tools. In practice, we are using your tool from the command line, and we can’t import it as a Python library. It would be nice to be able to produce JSON/XML output with command line arguments from the official version of pdfid.

If you are interested to see pdfid’s role, more can be seen here. We are using the triage plugin.

Comment by — Friday 31 May 2019
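Until a tool offers structured output itself, a pipeline can convert a plain-text report on its own side. The sketch below assumes the report consists of lines that end in an integer count, preceded by a keyword; that layout, the function names and the sample text are assumptions for illustration, not a description of pdfid's official output format.

```python
import json


def parse_counts(report_text):
    """Map each 'name  count' line of a text report to an integer."""
    counts = {}
    for line in report_text.splitlines():
        parts = line.strip().rsplit(None, 1)
        if len(parts) != 2:
            continue
        try:
            counts[parts[0]] = int(parts[1])
        except ValueError:
            # Header lines and other non-count lines are skipped.
            continue
    return counts


def report_to_json(report_text):
    """Serialize the parsed counts as a JSON object with sorted keys."""
    return json.dumps(parse_counts(report_text), sort_keys=True)


# Hypothetical sample, shaped like a keyword/count report.
SAMPLE = """PDFiD 0.0.2 test.pdf
 PDF Header: %PDF-1.1
 obj                    7
 /JS                    1
 /JavaScript            1
"""
```

In a pipeline, report_text would be the captured standard output of the command line tool; the parsing step stays the same whichever way the text is obtained.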