Know Program Database file (PDB)

http://www.codeproject.com/Articles/349076/Know-Program-Database-file-PDB


Introduction 

This article is for developers at a beginner or intermediate level who don't yet have a clear understanding of why PDBs matter and why they are required.

What is PDB

PDB stands for Program Database.

A PDB file is typically created from source files during compilation. It stores a list of all symbols in a module with their addresses and possibly the name of the file and the line on which the symbol was declared. (from Wikipedia)

Why is the PDB a separate file?

These symbols could be embedded in the binary, but that makes the file significantly larger (sometimes by several megabytes). To avoid this extra size, modern compilers and early mainframe debugging systems write the symbolic information to a separate file. For Microsoft compilers, this file is called a PDB file.

What a PDB file contains:

The following are some of the important pieces of information stored in a PDB file.

1. Local variable names. To show that the PDB contains local variable names, decompile the assembly with Reflector while the PDB is in the same folder as the assembly. Reflector has an option called "Show PDB symbols" (shown in the screenshot) that, when checked, also loads the PDB for that assembly. With the option checked, the decompiled code uses the same variable names as your source. Without the PDB, or with the option unchecked, the local variables in the decompiled code are replaced with generated names such as "str" for a string variable and "num" for a decimal.

 

2. Source file name 

3. Line no. of the source

4. Source Indexing (Explained in later section)

To show that the PDB contains the source file name and line number (points 2 and 3), run the following console application twice: first with the PDB in the same folder, then after deleting the PDB file.

using System;
using System.IO;

namespace UnderstandingPDBs
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                int sum = Add(5, 10);
                decimal value = Divide(10, 0);
            }
 
            catch
            {
            }
        }
 
        private static int Add(int i, int j)
        {
            return i + j;
        }
 
        private static decimal Divide(int i, int j)
        {
            try
            {
                return i / j;
            }
            catch (Exception ex)
            {
                LogError(ex);
                throw; // rethrow without resetting the stack trace
            }
          
        }
 
        private static void LogError(Exception ex)
        {
            using (var txtWriter = new StreamWriter(@"dump.txt",true))
            {
                string error = "Exception:" + ex.Message +
                Environment.NewLine + "StackTrace:" + ex.StackTrace;
                if(ex.InnerException!=null)
                    error=error+"INNER EXCEPTION:"+ex.InnerException;
                txtWriter.WriteLine(error);
            }
        }
 
       
    }
}  

With the PDB present, the application logs this exception:

Exception:Attempted to divide by zero.
StackTrace:   at UnderstandingPDBs.Program.Divide(Int32 i, Int32 j) in C:\Users\Rishi\Documents\Visual Studio 2010\Projects\UnderstandingPDBs\Program.cs:line 33

 

Without the PDB, the logged exception looks like this:

Exception:Attempted to divide by zero.
StackTrace:   at UnderstandingPDBs.Program.Divide(Int32 i, Int32 j)
---------


The version with the PDB clearly shows the source file name and the line number where the exception was thrown.
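The same file and line information can be read programmatically. The following is a minimal sketch, not part of the original sample: the class FrameInfoDemo and the method DescribeTopFrame are hypothetical names chosen for illustration. It relies on System.Diagnostics.StackTrace, which returns a null file name and a line number of 0 when no matching PDB can be found.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public static class FrameInfoDemo
{
    // Describes the innermost frame of an exception.
    // The file name and line number are only available when a matching PDB is found.
    public static string DescribeTopFrame(Exception ex)
    {
        // 'true' asks the runtime to capture file and line information if it can.
        var trace = new StackTrace(ex, true);
        StackFrame frame = trace.GetFrame(0);
        if (frame == null)
            return "<no frames>";

        var method = frame.GetMethod();
        string name = method != null ? method.Name : "<unknown>";
        string file = frame.GetFileName();      // null without a PDB
        int line = frame.GetFileLineNumber();   // 0 without a PDB

        return file == null
            ? name + " (no symbol information)"
            : name + " in " + file + ":line " + line;
    }

    // NoInlining keeps this method visible as its own stack frame in optimized builds.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static int Divide(int i, int j)
    {
        return i / j;
    }

    public static void Main()
    {
        try
        {
            Divide(10, 0);
        }
        catch (DivideByZeroException ex)
        {
            Console.WriteLine(DescribeTopFrame(ex));
        }
    }
}
```

Running it once with the PDB next to the executable and once after deleting the PDB should reproduce the difference shown in the two logs above.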

How is the PDB loaded by the debugger?

The Visual Studio debugger expects the PDB file to be in the same folder as the DLL or EXE. The PDB generated for an assembly is unique to each build, which means you can't use the PDB from one build with the assembly from another build, even if the code has not changed. The debugger decides whether a PDB belongs to a binary by comparing a specific GUID in the PDB with the one in the binary. This GUID is embedded in both the binary and the PDB during compilation, which tightly links the PDB to its binary.
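To see the GUID that ties a binary to its PDB, you can read the debug directory of the binary. The sketch below is an illustration under assumptions, not the debugger's own logic: PdbIdentity and ReadPdbIdentity are hypothetical names, and the code uses the PEReader type from System.Reflection.PortableExecutable (built into modern .NET; on .NET Framework it comes from the System.Reflection.Metadata NuGet package).

```csharp
using System;
using System.IO;
using System.Reflection.PortableExecutable;

public static class PdbIdentity
{
    // Returns the GUID, age and PDB path recorded in a binary's CodeView
    // debug directory entry, or null if the binary has no such entry.
    public static string ReadPdbIdentity(string binaryPath)
    {
        using (var stream = File.OpenRead(binaryPath))
        using (var pe = new PEReader(stream))
        {
            foreach (DebugDirectoryEntry entry in pe.ReadDebugDirectory())
            {
                if (entry.Type != DebugDirectoryEntryType.CodeView)
                    continue;

                CodeViewDebugDirectoryData data = pe.ReadCodeViewDebugDirectoryData(entry);
                return data.Guid + " / age " + data.Age + " / " + data.Path;
            }
        }
        return null;
    }

    public static void Main(string[] args)
    {
        // Inspect the file given on the command line, or this program itself.
        string path = args.Length > 0 ? args[0] : typeof(PdbIdentity).Assembly.Location;
        Console.WriteLine(ReadPdbIdentity(path) ?? "no CodeView debug entry found");
    }
}
```

Building the same project twice and running this against both outputs is a quick way to check whether the recorded GUID differs between builds on your toolchain.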

Different Build Settings in Visual Studio

Visual Studio has three build options that control the generation of debug symbols:

1. none: No PDB files are generated.

2. pdb-only: The debug symbols are stored only in the PDB file and not in the binary.

3. full: In addition to the symbols in the PDB, the binary also contains some debug information.

Full is the default option for Debug builds in Visual Studio; Release builds default to pdb-only.

According to MSDN:

"If you use /debug:full, be aware that there is some impact on the speed and size of JIT optimized code and a small impact on code quality with /debug:full. We recommend /debug:pdbonly or no PDB for generating release code."
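One rough way to check how an existing assembly was built is to look at its DebuggableAttribute, which the compiler uses to tell the JIT whether to optimize and whether to track debug information. The following is a sketch only: BuildFlavor and Describe are hypothetical names, and the exact flags emitted for each /debug setting vary between compiler versions, so treat the output as a hint rather than a definitive answer.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

public static class BuildFlavor
{
    // Summarizes the JIT-related flags recorded on an assembly at compile time.
    public static string Describe(Assembly assembly)
    {
        var attribute = (DebuggableAttribute)Attribute.GetCustomAttribute(
            assembly, typeof(DebuggableAttribute));

        if (attribute == null)
            return "no DebuggableAttribute found";

        return "JIT optimizer disabled: " + attribute.IsJITOptimizerDisabled
             + ", JIT tracking enabled: " + attribute.IsJITTrackingEnabled;
    }

    public static void Main()
    {
        // Inspect the assembly that contains this program.
        Console.WriteLine(Describe(typeof(BuildFlavor).Assembly));
    }
}
```

A release binary that reports the JIT optimizer as disabled is a sign that it was built with debug-oriented settings.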

Should we deploy PDBs along with binaries?

If the size of the deliverables is not a concern, it is good to deploy the PDBs along with the binaries, because they provide more information about exceptions, as the example above showed. PDBs can be very helpful for crashes that occur intermittently for some users, which are hard to diagnose without them.

You don't have to deploy PDBs with the binaries to get that extra information about an exception. The same can be achieved with a symbol server and source indexing, which I discuss in the sections below.

Security risk with PDBs? Anyone with access to the DLL or EXE can easily reverse engineer it into source code, with or without the PDB, using tools like Reflector. So withholding the PDB does not help much in this case.

If the PDB is deployed and users don't have access to the binaries, it is not a good idea to show them stack trace information and reveal the internals of the application.

Symbol Server 

A symbol server stores PDB files in a location known to the debugger, which can use them to produce more descriptive call stack information.

We can set up our own symbol server using symstore.exe, which lets the debugger find the exact PDB associated with the binary in question. symstore.exe is included in the Debugging Tools for Windows package.

Microsoft also maintains a public symbol server, from which we can load the PDBs for Microsoft binaries.
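A symbol store is essentially a directory tree keyed by the PDB's identity. As a sketch, assuming the commonly used layout of file name, then GUID (32 uppercase hex digits) followed by the age in hex, then file name again, the relative path for a PDB can be computed as below. SymbolStorePath and Build are hypothetical names, and the GUID in Main is a made-up value for illustration.

```csharp
using System;

public static class SymbolStorePath
{
    // Builds the relative path under which a PDB is assumed to be stored:
    //   <pdb name>/<GUID without dashes, uppercase><age in hex>/<pdb name>
    public static string Build(string pdbFileName, Guid guid, int age)
    {
        string id = guid.ToString("N").ToUpperInvariant() + age.ToString("X");
        return pdbFileName + "/" + id + "/" + pdbFileName;
    }

    public static void Main()
    {
        // Made-up identity; a real one comes from the binary's debug directory.
        var guid = new Guid("11111111-2222-3333-4444-555555555555");
        Console.WriteLine(Build("UnderstandingPDBs.pdb", guid, 1));
    }
}
```

Because the GUID is part of the path, PDBs from many builds of the same assembly can sit side by side in one store, and the debugger can pick the one that matches the binary being debugged.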

How and why to load the Microsoft Symbol Store:

When you stop at a breakpoint and open the Modules window (shown below), you will find all the DLLs (external or internal) loaded up to that point, but the Symbol Status column will by default show "Cannot find or open the PDB file" for everything except your own assemblies. These are the Microsoft BCL binaries, whose symbols are not loaded because the debugger can't find the matching PDBs.

 

To load these symbols, go to Tools > Options > Debugging > Symbols, check Microsoft Symbol Servers, and set "Cache symbols in this directory" to a shared folder so that all developers can use it.

Since these binaries are external to your application, you also need to uncheck "Enable Just My Code" under Debugging > General.

In the screenshot below you can see that I have loaded the symbols and the status now shows "Symbols loaded".

So how can this be useful?
You can put a breakpoint in your code and compare the call stack with and without the symbols loaded.
The figure below shows the call stack without the symbols loaded. It displays only my own methods and collapses the BCL methods into [External Code].

With the symbols loaded, the call stack displays every method call leading up to the breakpoint (see the figure below). This is helpful when a behaviour change in our application is caused by external code and we want to know which external methods were called, so that we can analyze them with Reflector or by debugging the disassembly.


Call stack after Symbols are loaded

Alongside the symbol server there is also the source server, which is used to retrieve the exact version of the source files used to build a particular application. Binaries can be source indexed when they are built. This information is stored in the PDB file and helps the source server find the exact source file.

You can check MSDN to learn more about symbol and source stores.

Points of Interest

PDB files are a proprietary Microsoft format and are sparsely documented. I will be happy to learn more from your feedback, and I plan to explore and publish more about them in the future.