# Example 1: print the text of every page
import PyPDF2

pdfFileObj = open('sample.pdf', 'rb')
pdfReader = PyPDF2.PdfFileReader(pdfFileObj)

# Iterate over all pages
for pageNum in range(pdfReader.numPages):
    pageObj = pdfReader.getPage(pageNum)
    print(pageObj.extractText())

# Release the file handle once reading is done
pdfFileObj.close()
# Example 2: print the document metadata
import PyPDF2

pdfFileObj = open('sample.pdf', 'rb')
pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
print(pdfReader.documentInfo)
pdfFileObj.close()

Here, the script prints the document information stored in the PDF file, which might include the author, title, creation date, and other metadata. Both examples use the PyPDF2 package, specifically the PdfFileReader class, to accomplish the tasks.
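The class and method names used above (PdfFileReader, numPages, getPage, extractText, documentInfo) belong to the older PyPDF2 interface, which PyPDF2 3.0 removed in favor of PdfReader, pages, extract_text() and metadata. The following is a minimal sketch of the same two tasks, assuming PyPDF2 3.0 or later is installed. The function names read_pdf and clean_metadata are hypothetical helpers introduced only for illustration; they are not part of PyPDF2.

```python
def clean_metadata(info):
    """Turn a PDF info dictionary such as {'/Author': 'Ann'} into {'Author': 'Ann'}.

    Keys in a PDF info dictionary start with '/'. Missing or empty values
    are skipped. (Hypothetical helper, not part of PyPDF2.)
    """
    if not info:
        return {}
    return {
        str(key).lstrip("/"): str(value)
        for key, value in info.items()
        if value is not None and str(value) != ""
    }


def read_pdf(path):
    """Return (page_texts, metadata) for the PDF at `path`.

    (Hypothetical helper; assumes PyPDF2 >= 3.0.)
    """
    # Imported here so clean_metadata stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader

    # The with-block closes the file even if parsing raises an error.
    with open(path, "rb") as pdf_file:
        reader = PdfReader(pdf_file)
        # Guard against pages that yield no text (for example, scanned images).
        page_texts = [page.extract_text() or "" for page in reader.pages]
        # reader.metadata can be None when the PDF has no info dictionary.
        metadata = clean_metadata(reader.metadata)
    return page_texts, metadata
```

Calling read_pdf('sample.pdf') would return a list with one string per page and a plain dictionary of whatever metadata the file carries. Keeping clean_metadata separate is a design choice made here so that the formatting step can be checked without a PDF on disk.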