Read And Write PDF Forms From Excel (VBA)

Share this

October 20, 2013

Read And Write PDF Forms From Excel
 

Introduction


 

This post is based on a blog reader request. Mac commented in an old post about a problem he had with a PDF form. He actually wanted to fill the fields of a PDF form using data from an XML file. I had never worked with PDF forms until that time, so it was an excellent opportunity to learn something new. Although this post doesn’t answer Mac’s request directly, it provides some ideas to anyone who wants to exchange data with PDF forms using Excel and Adobe Professional. Moreover, if you have data in an XML file, you can import the data in Excel and then run the macros provided here.

For demonstration purposes, I have created a sample conference PDF form (Test Form). Below you will find VBA code for writing and reading data from Test Form. The WritePDFForms macro uses the sheet Write to fill the Test Form, which is located in the same folder as the sample workbook. The data from each row is used to create a new PDF file, saved in the Forms subfolder. The First Name and Last Name columns in each row are combined to create the PDF file’s name. On the other hand, the ReadPDFForms macro loops through all the files in the specified folder (Forms). If a PDF file is found, the macro opens it, reads specific fields from the file, and writes the values in the “Read” sheet. The code of both macros can be easily adjusted to work with any PDF form, as long as the PDF form’s field names are known.

 


VBA code


Here is the VBA code for WritePDFForms macro.
 
 
 
Option Explicit

Sub WritePDFForms()
      
    '--------------------------------------------------------------------------------------
    'This macro uses the data in sheet Write in order to fill a sample PDF form named
    'Test Form, which is located in the same folder with this workbook. The data from
    'each row is used to create a new PDF file, which is saved in the Forms subfolder.
        
    'The code uses late binding, so no reference to external library is required.
    'However, the code works ONLY with Adobe Professional, so don't try to use it with
    'Adobe Reader because you will get an "ActiveX component can't create object" error.
    
    'Written by:    Christos Samaras
    'Date:          15/10/2013
    'e-mail:        [email protected]
    'site:          https://myengineeringworld.net/////
    '--------------------------------------------------------------------------------------

    'Declaring the necessary variables.
    Dim strPDFPath              As String
    Dim strFieldNames(1 To 11)  As String
    Dim i                       As Long
    Dim j                       As Integer
    Dim LastRow                 As Long
    Dim objAcroApp              As Object
    Dim objAcroAVDoc            As Object
    Dim objAcroPDDoc            As Object
    Dim objJSO                  As Object
    Dim strPDFOutPath           As String
           
    'Disable screen flickering.
    Application.ScreenUpdating = False
    
    'Specify the path of the sample PDF form.
    'Full path example:
    'strPDFPath = "C:UsersChristosDesktopTest Form.pdf"
    'Using workbook path:
    strPDFPath = ThisWorkbook.Path & "" & "Test Form.pdf"
    
    'Set the required field names in the PDF form.
    strFieldNames(1) = "First Name"
    strFieldNames(2) = "Last Name"
    strFieldNames(3) = "Street Address"
    strFieldNames(4) = "City"
    strFieldNames(5) = "State"
    strFieldNames(6) = "Zip Code"
    strFieldNames(7) = "Country"
    strFieldNames(8) = "E-mail"
    strFieldNames(9) = "Phone Number"
    strFieldNames(10) = "Type Of Registration"
    strFieldNames(11) = "Previous Attendee"
    
    'Find the last row of data in sheet Write.
    With shWrite
        .Activate
        LastRow = .Cells(.Rows.Count, "B").End(xlUp).Row
    End With
    
    'Loop through all rows of sheet Write and use the data to fill the PDF form.
    For i = 4 To LastRow
    
        On Error Resume Next
        
        'Initialize Acrobat by creating the App object.
        Set objAcroApp = CreateObject("AcroExch.App")
        
        'Check if the object was created.
        If Err.Number <> 0 Then
            MsgBox "Could not create the App object!", vbCritical, "Object error"
            'Release the object and exit.
            Set objAcroApp = Nothing
            Exit Sub
        End If
        
        'Create the AVDoc object.
        Set objAcroAVDoc = CreateObject("AcroExch.AVDoc")
        
        'Check if the object was created.
        If Err.Number <> 0 Then
            MsgBox "Could not create the AVDoc object!", vbCritical, "Object error"
            'Release the objects and exit.
            Set objAcroAVDoc = Nothing
            Set objAcroApp = Nothing
            Exit Sub
        End If
        
        On Error GoTo 0
        
        'Open the PDF file.
        If objAcroAVDoc.Open(strPDFPath, "") = True Then
            
            'Set the PDDoc object.
            Set objAcroPDDoc = objAcroAVDoc.GetPDDoc
           
            'Set the JS Object - Java Script Object.
            Set objJSO = objAcroPDDoc.GetJSObject
            
            On Error Resume Next
            
            'Fill the form fields.
            For j = 1 To 10
                
                objJSO.GetField(strFieldNames(j)).Value = CStr(shWrite.Cells(i, j + 1).Value)
                
                If Err.Number <> 0 Then
                    
                    'Close the form without saving the changes.
                    objAcroAVDoc.Close True
                    
                    'Close the Acrobat application.
                    objAcroApp.Exit
                    
                    'Inform the user about the error.
                    MsgBox "The field """ & strFieldNames(j) & """ could not be found!", vbCritical, "Field error"
                    
                    'Release the objects and exit.
                    Set objJSO = Nothing
                    Set objAcroPDDoc = Nothing
                    Set objAcroAVDoc = Nothing
                    Set objAcroApp = Nothing
                    Exit Sub
                    
                End If
            Next j
            
            'Fill the checkbox field.
            If shWrite.Cells(i, j + 1).Value = "True" Then
                objJSO.GetField(strFieldNames(11)).Value = "Yes"
            End If
            
            On Error GoTo 0
            
            'Create the output path, which will be like C:UsersChristosDesktopForms�1) First Name Last Name.pdf.
            With shWrite
                If i - 3 < 10 Then
                    strPDFOutPath = ThisWorkbook.Path & "Forms�" & i - 3 & ") " & .Cells(i, 2).Value & " " & .Cells(i, 3).Value & ".pdf"
                Else
                    strPDFOutPath = ThisWorkbook.Path & "Forms" & i - 3 & ") " & .Cells(i, 2).Value & " " & .Cells(i, 3).Value & ".pdf"
                End If
            End With
            
            'Save the form as new PDF file.
            objAcroPDDoc.Save 1, strPDFOutPath
    
            'Close the form without saving the changes.
            objAcroAVDoc.Close True
            
            'Close the Acrobat application.
            objAcroApp.Exit
               
            'Release the objects.
            Set objJSO = Nothing
            Set objAcroPDDoc = Nothing
            Set objAcroAVDoc = Nothing
            Set objAcroApp = Nothing
            
        Else
        
            MsgBox "Could not open the file!", vbCritical, "File error"
            
            'Close the Acrobat application.
            objAcroApp.Exit
            
            'Release the objects and exit.
            Set objAcroAVDoc = Nothing
            Set objAcroApp = Nothing
            Exit Sub
            
        End If
        
    Next i
    
    'Enable the screen.
    Application.ScreenUpdating = True
    
    'Inform the user that forms were filled.
    MsgBox "All forms were created successfully!", vbInformation, "Finished"
    
End Sub
 

The code for ReadPDFForms macro follows.

Option Explicit

Sub ReadPDFForms()
   
    '--------------------------------------------------------------------------------------
    'This macro loops through all the files of the specified folder (Forms). If the file
    'is PDF (PDF form here) the macro opens the file, reads specific fields the file
    'and writes the values in the sheet Read.
        
    'The code uses late binding, so no reference to external library is required.
    'However, the code works ONLY with Adobe Professional, so don't try to use it with
    'Adobe Reader because you will get an "ActiveX component can't create object" error.
    
    'Written by:    Christos Samaras
    'Date:          15/10/2013
    'e-mail:        [email protected]
    'site:          https://myengineeringworld.net/////
    '--------------------------------------------------------------------------------------

    'Declaring the necessary variables.
    Dim strFormsFolder          As String
    Dim strFieldNames(1 To 7)   As String
    Dim objFSO                  As Object
    Dim objSourceFolder         As Object
    Dim objFileItem             As Object
    Dim j                       As Integer
    Dim LastRow                 As Long
    Dim objAcroApp              As Object
    Dim objAcroAVDoc            As Object
    Dim objAcroPDDoc            As Object
    Dim objJSO                  As Object
    Dim strPDFOutPath           As String
           
    'Disable screen flickering.
    Application.ScreenUpdating = False
        
    'Specify the folder that contains the PDF forms.
    'Full path example (note the  at the end):
    'strPDFPath = "C:UsersChristosDesktopForms"
    'Using workbook path:
    strFormsFolder = ThisWorkbook.Path & "Forms"
    
    'Set the required field names in the PDF form.
    strFieldNames(1) = "First Name"
    strFieldNames(2) = "Last Name"
    strFieldNames(3) = "City"
    strFieldNames(4) = "Country"
    strFieldNames(5) = "E-mail"
    strFieldNames(6) = "Type Of Registration"
    strFieldNames(7) = "Previous Attendee"
    
    On Error Resume Next
    
    'Create the File System object.
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    
    'Check if the object was created.
    If Err.Number <> 0 Then
        MsgBox "Could not create the File System object!", vbCritical, "Object error"
        'Release the object and exit.
        Set objFSO = Nothing
        Exit Sub
    End If
    On Error GoTo 0
                    
    'Get information about the Forms folder.
    Set objSourceFolder = objFSO.GetFolder(strFormsFolder)
    
    'Loop through all the files found in the folder Forms.
    For Each objFileItem In objSourceFolder.Files
        
        'Check if the file is pdf.
        If LCase(Right(objFileItem.Path, 3)) = "pdf" Then
            
            'Find the last row of data in sheet Read.
            With shRead
                .Activate
                LastRow = .Cells(.Rows.Count, "B").End(xlUp).Row
            End With
        
            On Error Resume Next
            
            'Initialize Acrobat by creating the App object.
            Set objAcroApp = CreateObject("AcroExch.App")
            
            'Check if the object was created.
            If Err.Number <> 0 Then
                MsgBox "Could not create the App object!", vbCritical, "Object error"
                'Release the objects and exit.
                Set objAcroApp = Nothing
                Set objFileItem = Nothing
                Set objSourceFolder = Nothing
                Set objFSO = Nothing
                Exit Sub
            End If
            
            'Create the AVDoc object.
            Set objAcroAVDoc = CreateObject("AcroExch.AVDoc")
            
            'Check if the object was created.
            If Err.Number <> 0 Then
                MsgBox "Could not create the AVDoc object!", vbCritical, "Object error"
                'Release the objects and exit.
                Set objAcroAVDoc = Nothing
                Set objAcroApp = Nothing
                Set objFileItem = Nothing
                Set objSourceFolder = Nothing
                Set objFSO = Nothing
                Exit Sub
            End If
            
            On Error GoTo 0
        
            'Open the PDF file.
            If objAcroAVDoc.Open(objFileItem.Path, "") = True Then
            
                'Set the PDDoc object.
                Set objAcroPDDoc = objAcroAVDoc.GetPDDoc
               
                'Set the JS Object - Java Script Object.
                Set objJSO = objAcroPDDoc.GetJSObject
                
                'Create a counter in sheet Read (counts each PDF file).
                shRead.Cells(LastRow + 1, 1).Value = LastRow - 2
                
                On Error Resume Next
                
                'Read the form fields.
                For j = 1 To 6
                    
                    shRead.Cells(LastRow + 1, j + 1).Value = objJSO.GetField(strFieldNames(j)).Value
                    
                    If Err.Number <> 0 Then
                        
                        'Close the form without saving the changes.
                        objAcroAVDoc.Close True
                        
                        'Close the Acrobat application.
                        objAcroApp.Exit
                        
                        'Inform the user about the error.
                        MsgBox "The field """ & strFieldNames(j) & """ could not be found!", vbCritical, "Field error"
                        
                        'Release the objects and exit.
                        Set objJSO = Nothing
                        Set objAcroPDDoc = Nothing
                        Set objAcroAVDoc = Nothing
                        Set objAcroApp = Nothing
                        Set objFileItem = Nothing
                        Set objSourceFolder = Nothing
                        Set objFSO = Nothing
                        Exit Sub
                        
                    End If
                Next j
                
                'Read the checkbox field.
                If objJSO.GetField(strFieldNames(7)).Value = "Yes" Then
                    shRead.Cells(LastRow + 1, j + 1).Value = "True"
                Else
                    shRead.Cells(LastRow + 1, j + 1).Value = "False"
                End If
                
                On Error GoTo 0
                
                'Close the PDF file without saving the changes.
                objAcroAVDoc.Close True
                
                'Close the Acrobat application.
                objAcroApp.Exit
               
                'Release the objects.
                Set objJSO = Nothing
                Set objAcroPDDoc = Nothing
                Set objAcroAVDoc = Nothing
                Set objAcroApp = Nothing
    
            Else
            
                MsgBox "Could not open the file!", vbCritical, "File error"
                
                'Close the Acrobat application.
                objAcroApp.Exit
            
                'Release the objects abd exit.
                Set objAcroAVDoc = Nothing
                Set objAcroApp = Nothing
                Set objFileItem = Nothing
                Set objSourceFolder = Nothing
                Set objFSO = Nothing
                
            End If
        
        End If
        
        'Renew the last row value.
        LastRow = LastRow + 1
            
    Next objFileItem
    
    'Release the objects.
    Set objFileItem = Nothing
    Set objSourceFolder = Nothing
    Set objFSO = Nothing
    
    'Adjust the columns' width.
    shRead.Columns("A:H").AutoFit
    
    'Enable the screen.
    Application.ScreenUpdating = True
    
    'Inform the user that the data acquisition finished.
    MsgBox "All data from the forms were retrieved successfully!", vbInformation, "Finished"
    
End Sub

How to find the field names of a PDF form?


 

Edit PDF Form

Open the PDF form using Adobe Professional and follow the next steps:

  1. Press the Tools button to show the Tools panel.
  2. Select the Forms Tab.
  3. Press the Edit button.

Instead of these three steps, you can press CTRL + SHIFT + 7.

PDF Field Names

The field names will be shown (highlighted in black color), so you can edit them according to your needs. If you need to retrieve the form fields’ names programmatically, you can use the following VBA code.

Option Explicit

Sub RetrievePDFFormNames()
      
    '--------------------------------------------------------------------------------------
    'This macro opens a PDF form, loops through all of its fields and prints their names
    'in the immediate window of VBA editor (in order to be further processed).
        
    'The code uses late binding, so no reference to external library is required.
    'However, the code works ONLY with Adobe Professional, so don't try to use it with
    'Adobe Reader because you will get an "ActiveX component can't create object" error.
    
    'Written by:    Christos Samaras
    'Date:          15/10/2013
    'e-mail:        [email protected]
    'site:          https://myengineeringworld.net/////
    '--------------------------------------------------------------------------------------

    'Declaring the necessary variables.
    Dim strPDFPath              As String
    Dim objAcroApp              As Object
    Dim objAcroAVDoc            As Object
    Dim objAcroForm             As Object
    Dim objobjFields            As Object
    Dim objField                As Object
    
    'Specify the path of the sample PDF form.
    'Full path example:
    'strPDFPath = "C:UsersChristosDesktopTest Form.pdf"
    'Using workbook path:
    strPDFPath = ThisWorkbook.Path & "" & "Test Form.pdf"
        
    On Error Resume Next
    
    'Initialize Acrobat by creating the App object.
    Set objAcroApp = CreateObject("AcroExch.App")
    
    'Check if the object was created.
    If Err.Number <> 0 Then
        MsgBox "Could not create the App object!", vbCritical, "Object error"
        'Release the object and exit.
        Set objAcroApp = Nothing
        Exit Sub
    End If
    
    'Create the AVDoc object.
    Set objAcroAVDoc = CreateObject("AcroExch.AVDoc")
    
    'Check if the object was created.
    If Err.Number <> 0 Then
        MsgBox "Could not create the AVDoc object!", vbCritical, "Object error"
        'Release the objects and exit.
        Set objAcroAVDoc = Nothing
        Set objAcroApp = Nothing
        Exit Sub
    End If
    
    On Error GoTo 0
        
    'Open the PDF file.
    If objAcroAVDoc.Open(strPDFPath, "") = True Then
         
        'Show the application.
        objAcroApp.Show
        
        On Error Resume Next
        
        'Create the Acrobat Form object.
        Set objAcroForm = CreateObject("AFormAut.App")
        
        'Check if the object was created.
        If Err.Number <> 0 Then
            MsgBox "Could not create the AFormAutobject!", vbCritical, "Object error"
            'Release the objects and exit.
            Set objAcroAVDoc = Nothing
            Set objAcroApp = Nothing
            Exit Sub
        End If
        
        On Error GoTo 0
        
        'Get the fields of the form.
        Set objobjFields = objAcroForm.Fields
        
        'Print the number of form fields.
        Debug.Print "Total form Fields = " & objobjFields.Count
        
        'Print the name of the fields and their values (if there are any).
        For Each objField In objobjFields
            Debug.Print objField.Name & " = " & objField.Value
        Next objField
        
    Else
        
        Debug.Print "Could not open the file!"
        
        'Close the Acrobat application.
        objAcroApp.Exit
        
        'Release the objects and exit.
        Set objAcroAVDoc = Nothing
        Set objAcroApp = Nothing
        Exit Sub
            
    End If
    
    'Close the form without saving the changes.
    objAcroAVDoc.Close True
    
    'Close the Acrobat application.
    objAcroApp.Exit
       
    'Release the objects.
    Set objField = Nothing
    Set objobjFields = Nothing
    Set objAcroForm = Nothing
    Set objAcroAVDoc = Nothing
    Set objAcroApp = Nothing
    
End Sub

 

PDF Field Names - Immediate Window

The results will appear in the immediate window of the VBA editor. Then you can copy and use them as you need.

 


Demonstration video


The short video below demonstrates the WritePDFForms and ReadPDFForms macros in action.


Download it from here


 

Download

The zip file contains an Excel workbook that contains the three macros presented above, the sample PDF form, and an empty folder (Forms). The workbook can be opened with Excel 2007 or newer.

Page last modified: July 25, 2021

Christos Samaras

Hi, I am Christos, a Mechanical Engineer by profession (Ph.D.) and a Software Developer by obsession (10+ years of experience)! I founded this site back in 2011 intending to provide solutions to various engineering and programming problems.

  • Hi, Bryan,

    Thank you for your kind words.
    Regarding your question, you will probably have to download the file on your local computer and then try the code.
    Here is a link that shows how to download files with VBA.

    Best Regards,
    Christos

  • Hi Christos,
    Lovely coding. Thanks for sharing. It works perfectly if the files are on my computer. I want for this to work from a Sharepoint drive in the cloud and I get an error after pressing the magic button “Fill Data In Forms”. An Adobe Acrobat Pro error pops up with the following: “There was an error opening this document. The filename, directory name, or volume label syntax is incorrect.” After pressing okay, a File error screen pops up with “Could not open the file!” Any help would be greatly appreciated.

  • Eileen,

    I haven’t tried the code with digitally signed PDF files.
    Nevertheless, I don’t see why the signature will affect the extraction process.

    Best Regards,
    Christos

  • Hi Christos, thanks! May I also check if the PDF file contains a digital signature, will it affect the macro extracting the PDF fields?

  • Eileen,

    The WritePDFForms macro, as it is, will throw that error whenever the PDF field is not found.
    So, this means that the first field you enter probably does not exist in the PDF form.

    Best Regards,
    Christos

  • Hi Christos,

    The code works great. However, I realised that when the PDF file’s first text field is empty (i.e. no text filled up), it shows an error “The field xxx could not be found” and the values in that PDF form does not write to the file. Any idea why is this so and any idea to resolve this problem (i.e. when first text field is empty, continue writing the values in the PDF form to excel with the first text field value as empty value)?

  • {"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}
    Add Content Block
    >