Source Code & Software Patents: A Guide to Software & Internet Patent Litigation for Attorneys & Experts
by Andrew Schulman (http://www.SoftwareLitigationConsulting.com)
Detailed outline for forthcoming book
Chapter 22: Comparing source code to the commercial product at issue
Chapter 2 has already discussed the relationship between source code on the one hand, and the accused instrumentality (generally a commercial product or service) on the other, as well as the relationship between source code and prior art used to show anticipation, obviousness, on-sale, or public-use. To recap, where “x” is a claim limitation, or an entire claim, x’s role in the source code may not reflect x’s role in a product/service because:
- x represents a not-used experiment (though it may still be significant e.g. as “scaffolding”)
- x represents old, outdated, “dead code,” not built/compiled/linked into the product
- x is used internally to automatically generate source code y, and y is in the product
- x was used in an older product (which may nonetheless be within the SOL/laches period, or otherwise significant)
- x is going to be used in a forthcoming product (see chapter 14 on supplementing the complaint)
- x is in the product, but is never executed/invoked (see chapter 2 on “latent code”)
- x is in the product, and is executed/invoked, but hardly ever (see chapter 5 on “turning (apparent) mountains into molehills”)
- x is in the product, and plays an important role, but the accused infringer can argue that it is so slow, error-prone, or resource-intensive that damages should reflect its negative role (see chapter 30 on damages)
- x is being used as anticipatory prior art, the source code was not publicly-accessible at the relevant time, and the publicly-accessible product/service was not capable of teaching the PHOSITA (see chapter 4)
This chapter provides the detailed mechanics of how, given source code produced in discovery, the examiner can compare this specific source code with the commercial products at issue, both for infringement and invalidity analysis.
22.1 How to show that selected source code corresponds to the accused product
- One reason is to verify that correct source code was produced in discovery
- Another reason is to verify that, if source code has been produced for multiple versions, the source-code examiner has selected the correct version for each accused instrumentality
- One method is to examine the build/make scripts included with the source code, to determine which source-code files are used to build the product
- Another method is to take results from static reverse engineering of product (see chapter 6) and ensure that e.g. strings seen in selected source code are also seen in the product
- Aligning source code with products is a hard problem: see “Challenges of Binary Verification” section in firmware reverse engineering article: “The exact contents of the binary are strongly tied to the toolchain and overall environment they were compiled in….,” and referencing paper on difficulties verifying that TrueCrypt was not “back-doored” from source (“the output of compilers is often dependent on parameters that can be strongly tied to the building environment”)
- Dealing with “dead code”, #ifdefs, and other conditional compilation; can “dead code” ever infringe (e.g. as “scaffolding”)?
- Even if selected source code was used to build product (as shown e.g. in make/build script), it still may not be expressed in product: e.g. abstract/virtual functions or templates may appear relevant, but if not used, are likely not instantiated in the product (not even as uncalled “latent code”)
- Given multiple accused products and/or versions, the source code should be correlated to each accused product/version
- Representative instrumentalities: can this source code for product x be used to show the inner workings of related product y?; see chapters 7 and 26
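The string-matching check described in this section can be sketched as follows. This is a minimal illustration, not a definitive tool: the function names, the sample source line, and the sample product bytes are all hypothetical, and the literal extraction is deliberately naive (it ignores literals containing escapes, concatenated literals, and strings assembled at run time).

```python
import re

# Naive match for double-quoted C/C++ string literals on a single line.
_LITERAL = re.compile(r'"((?:[^"\\\n]|\\.)*)"')


def source_literals(source_text, min_len=6):
    """Return the set of string literals of at least min_len characters."""
    found = set()
    for match in _LITERAL.finditer(source_text):
        body = match.group(1)
        # Skip literals with escapes: their compiled bytes differ from the text.
        if "\\" not in body and len(body) >= min_len:
            found.add(body)
    return found


def literals_in_binary(literals, binary_bytes):
    """Map each literal to True if it occurs in the bytes as UTF-8 or UTF-16LE."""
    result = {}
    for literal in literals:
        narrow = literal.encode("utf-8")
        wide = literal.encode("utf-16-le")
        result[literal] = narrow in binary_bytes or wide in binary_bytes
    return result


# Hypothetical inputs standing in for a produced source file and a product file.
source = 'log_msg("Encoder fallback selected"); puts("ok");'
product = b"\x00\x01Encoder fallback selected\x00\x7f"
hits = literals_in_binary(source_literals(source), product)
```

A match is only corroboration: generic strings can appear in unrelated products, and a missing string does not prove absence, since the product may be compressed or encrypted, or the string may be built at run time.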
22.2 How to show that selected source code is actually called/invoked in the accused product
- Even if selected source code is expressed in the accused instrumentality, it is possible that the code is not called/invoked
- “Latent code” cases (Finjan, etc.): even un-called code may still infringe an apparatus claim, but is unlikely to infringe a method claim; similarly, un-called code may be evidence of making/selling infringement, but not of “using” infringement
- That selected code is called/invoked can in part be shown by static analysis of the code, by tracing from some known-invoked point (e.g. “main”) down to the code of interest (see chapter 19 on tracing)
- However, tracing by the source-code examiner is fallible; e.g., seemingly vital code may be bypassed due to error/exception handling; such bypassing may also reveal the presence of non-infringing alternatives
- The better way of showing selected code is actually called/invoked is by dynamic reverse engineering of the product, i.e. run-time testing/instrumentation of the accused instrumentality (see chapter 6, including dangers of drawing certain types of conclusions from dynamic RE)
- Dynamic reverse engineering methods include logging, debugging, and network monitoring
- Confirm that messages/data seen in code are actually sent/received: see ch. 6 on network monitoring
- Confirm that filenames, settings, etc. seen in code are actually opened/read/written: see ch. 6 on OS monitoring tools
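Static tracing from a known-invoked point such as "main" down to the code of interest can be sketched as a search over a call graph. The graph below is hypothetical (for example, transcribed by hand during code review or exported from a code-browsing tool), and the function names are invented for illustration.

```python
from collections import deque


def find_call_path(call_graph, start, target):
    """Breadth-first search for one shortest call chain from start to target.

    call_graph maps a caller name to the names it calls. Returns a list of
    function names, or None if the supplied graph contains no such path.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for callee in call_graph.get(path[-1], ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(path + [callee])
    return None


# Hypothetical call graph; every name here is an assumption.
graph = {
    "main": ["init", "run_loop"],
    "run_loop": ["handle_menu", "idle"],
    "handle_menu": ["export_file"],
    "export_file": ["encode_audio"],
    "legacy_export": ["encode_audio_v1"],
}

path_to_selected = find_call_path(graph, "main", "encode_audio")
path_to_legacy = find_call_path(graph, "main", "encode_audio_v1")
```

Here `path_to_selected` is the chain main, run_loop, handle_menu, export_file, encode_audio, while `path_to_legacy` is None because nothing reachable from main calls legacy_export. A static path shows only that a call is possible; calls through function pointers, virtual dispatch, or callbacks may be missing from the graph, and a path that exists on paper may never be taken at run time, which is why dynamic confirmation is still needed.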
22.3 How to determine the role or importance of the selected source code to the accused product
- Does this piece of code have the same importance within the accused product that it appears to have within the source code?
- While assessing role/importance for damages calculations may largely require non-technical economic evidence (see chapter 30 on damages), role/importance also requires technical evidence
- Can the selected source code be directly linked to a visible feature in the user interface? (e.g., selecting this menu item invokes this source code; this portion of the app’s GUI is created by this source code)
- Infringement should be “more than de minimis” [give examples of what de minimis infringement would look like]
- Infringing code may belong to a small separable component, rather than to the entire accused product (see also chapter 5 on defendants, and chapter 30 on damages)
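Linking a visible menu item to the source code it invokes often runs through resource identifiers. The sketch below assumes a Windows-style resource header with simple `#define ID..._NAME number` lines; the header contents, the file name `mainwnd.cpp`, and the handler `OnExport` are all hypothetical.

```python
import re

# Matches lines such as "#define IDM_FILE_EXPORT 40001" (assumed naming style).
_DEFINE = re.compile(r"^\s*#\s*define\s+(ID[A-Z]*_\w+)\s+(\d+)\s*$", re.MULTILINE)


def resource_ids(header_text):
    """Map resource symbol names to numeric IDs from simple #define lines."""
    return {name: int(value) for name, value in _DEFINE.findall(header_text)}


def references(symbol, sources):
    """Return (filename, line_number, text) for each source line naming symbol."""
    word = re.compile(r"\b%s\b" % re.escape(symbol))
    hits = []
    for filename, text in sources.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if word.search(line):
                hits.append((filename, number, line.strip()))
    return hits


# Hypothetical resource header and source file.
header = "#define IDM_FILE_EXPORT 40001\n#define IDD_ABOUT 100\n"
sources = {
    "mainwnd.cpp": "switch (id) {\ncase IDM_FILE_EXPORT:\n    OnExport();\n    break;\n}\n",
}

ids = resource_ids(header)
export_refs = references("IDM_FILE_EXPORT", sources)
```

The numeric ID (40001 in this made-up example) is what a resource viewer or GUI-inspection tool would show for the menu item in the shipped product, so the same number can tie a screenshot to the `case` label in the source.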
22.4 Anticipatory prior art: How to determine whether selected source code was represented in a product, in a publicly-accessible way, at the relevant time
- Employing source code to show invalidity is not an exact mirror image of using source code to show infringement: anticipatory prior art must not only incorporate all limitations of the claim (“all-elements rule”; but see combining of multiple references for obviousness), it must also have been publicly accessible to (and capable of teaching) the PHOSITA at the relevant time
- Thus, finding every claim limitation in source code which was proprietary at the relevant time is insufficient for anticipation; but see on-sale and public-use bars, which do not require public accessibility or teaching (see chapter 4)
- Anticipatory prior art found in source code must be shown to have been sufficiently expressed in a publicly-accessible activity or “publication,” such as a commercial product
- This can be shown with static reverse engineering of the contemporaneous commercial product
- See chapter 5 on locating older software for use as prior art
- Sometimes older versions of products will be included (perhaps unintentionally) as part of source-code productions; these can be examined using tools available on the source-code machine (such as “strings” on Macs, or by writing a small Visual Basic or PowerShell script on a Windows computer)
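Where no “strings” utility is available on the source-code machine, a small script can do a similar job. The following is a minimal sketch, assuming a scripting language such as Python is permitted on that machine; the sample bytes and the company name in them are invented for illustration.

```python
import re


def printable_strings(data, min_len=4):
    """Return (offset, text) for each run of printable ASCII at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]


# Hypothetical bytes standing in for an old product binary.
sample = b"\x00\x01Copyright 1994 Example Corp\x00\xff\xfeab\x00VERSION 2.1\x00"
found = printable_strings(sample)
```

For a real file, the bytes would come from `open(path, "rb").read()`. This sketch finds only single-byte ASCII text; wide-character (e.g. UTF-16) strings, common in Windows binaries, need separate handling, and packed or compressed executables may show few strings at all.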
22.5 How to use product testing to confirm, corroborate, or visibly demonstrate claim limitations found in the source code
- Apart from proof that the selected source code is reflected in the accused product, is invoked/called, and plays some non-de minimis role, another reason to supplement source-code examination with product inspection is to create additional, independent sources of information for what has been learned from the source code
- Inspection of the product will often yield more visually-compelling evidence than does the source code
- See 22.3 above on how to tie the selected source to something visible in the product’s user interface (GUI)
Is this source code actually important?
- Many questions about the role of allegedly infringing code cannot be answered with (or solely with) technical information, and require marketing and sales information, and an economics or accounting expert, or some characterization by the other side of how important the code is (see chapter 29 on correlating source code with non-source documents). However, there is obviously also a technical side to code importance, starting with the most basic questions of whether code the examiner has selected (e.g., for use on the right-hand side of an infringement or invalidity claim table) is present in the relevant place.
- For infringement and non-infringement analysis: is this code really in the product?; source code may be “dead code,” excluded from compilation because the file is not compiled and/or shipped with the product, or because portions of the source-code file are excluded (e.g. #ifdef NEVER, #if USE_ONLY_IN_ATARI_VERSION, etc.).
- For invalidity analysis: was this source code publicly accessible before the patent’s priority date, and/or was it used to build a product that was publicly accessible and if so, was there sufficient reflection of the source code in the product such that the PHOSITA at the time could have accessed it?; note that this does not matter for the on-sale and public-use bars.
- On the other hand, even “dead” or unused code may have been crucial to the development of a revenue-generating product; look for signs that dead code was necessary “scaffolding” (e.g., was a source file now used to build the product part of a natural evolution from an older file?).
- Apart from “dead code,” another way that parts of source code might not make their way into public and/or a product is when the code is abstract or virtual; watch out for templates, virtual functions, or preprocessor macros which must be instantiated, called, or otherwise used to be placed outside proprietary source code. Actually, with an optimizing compiler this could also be true of any uncalled functions. The examiner is safest with the rule (though not infallibly true) “Don’t point to any code which isn’t used” (i.e., called or instantiated, and the instantiation called).
- If the selected code truly was present in public and/or in a product, is the code ever actually called?; while apparatus claims may be infringed even by “latent code,” method claims likely aren’t. See Finjan and other latent code cases.
- Determining whether a piece of code is actually called may require dynamic examination of the running product, as well as static examination of the source code. Remember however that dynamic examination will only tell you what happened in the particular configurations which were tested and cannot, without some static analysis (either of source code or static reverse engineering e.g. disassembly or decompilation), definitively answer questions such as “is this code ever called?” or “is this code ever bypassed?”
- If the selected code truly is called, beware of assuming that it’s always called, usually called, or conversely almost never called. Are there alternative paths through the code which could render the examiner’s beautifully on-point code selection less relevant than it seems?:
- Can the code be overridden or bypassed?
- Is there error or exception handling (including SEH, i.e. structured exception handling) which could make the code bail out before completion of a method?; see error handling below.
- Conversely, does selection of this code first require some error or exception or special case?; e.g. if (something_that_happens_only_during_inhouse_test) call function_whose_name_is_very_helpful_to_client_case();
- If the selected code is located in the product, and is called at least sometimes, it may be possible to give a technical opinion on importance of selected code within a given product (though likely not whether consumers buy the product in order to buy the selected code, nor the role of the product within the market) simply by noting whether the selected code is always called whenever the product is started, or whenever a given user-visible feature is selected. Such an opinion should generally be based on examination of the product itself as well as the source code, using dynamic and/or static reverse engineering of the product; see below.
- Much of the above boils down to the examiner asking himself whether he knows and can explain to someone who hasn’t seen the code: Who calls the selected code? What are the circumstances in which (or triggers for which) the selected code is called?
- See also chapter 19 on tracing up and down: tracing up, to the callers of the selected code, helps show that the selected code is actually invoked/called; tracing down, to the functions that the selected code itself calls, helps show that the selected code actually does what its name implies.
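A first pass at the conditional-compilation question above (is the selected code excluded by something like #ifdef NEVER or #if 0?) can be sketched with a small script. This is a deliberately naive reading of the preprocessor, not a substitute for it: it understands only `#if 0`, `#if 1`, `#ifdef`/`#ifndef` against a caller-supplied set of macros assumed to be undefined, and the matching `#else`/`#endif`. Which macros really are undefined must come from the build scripts and compiler flags; the sample source is hypothetical.

```python
import re

_DIRECTIVE = re.compile(r"^\s*#\s*(ifdef|ifndef|if|elif|else|endif)\b(.*)$")
_COMMENT = re.compile(r"/\*.*?\*/|//.*")


def excluded_lines(source_text, undefined=frozenset()):
    """Return 1-based numbers of lines that this naive reading says are not compiled.

    Unrecognized conditions are treated as possibly live, so the result
    under-reports rather than over-reports dead code.
    """
    stack = []  # per open conditional: (branch_is_dead, else_would_be_dead)
    dead = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        match = _DIRECTIVE.match(line)
        if match is None:
            if any(is_dead for is_dead, _ in stack):
                dead.append(number)
            continue
        keyword = match.group(1)
        condition = _COMMENT.sub("", match.group(2)).strip()
        if keyword == "if":
            stack.append((condition == "0", condition == "1"))
        elif keyword == "ifdef":
            stack.append((condition in undefined, False))
        elif keyword == "ifndef":
            stack.append((False, condition in undefined))
        elif keyword == "elif" and stack:
            stack[-1] = (False, False)  # conservative: treat as possibly live
        elif keyword == "else" and stack:
            stack[-1] = (stack[-1][1], False)
        elif keyword == "endif" and stack:
            stack.pop()
    return dead


# Hypothetical source file; NEVER is assumed not to be defined by the build.
sample = """\
int encode(int x) {
#ifdef NEVER
    return old_encode(x);
#else
    return new_encode(x);
#endif
}
#if 0
void experiment(void) {}
#endif
"""
dead = excluded_lines(sample, undefined={"NEVER"})
```

With NEVER supplied as undefined, the sketch flags line 3 (the old_encode call) and line 9 (the experiment function). A more reliable check, where the toolchain is available, is to run the actual preprocessor with the product's build flags and inspect its output.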
Using static & dynamic analysis to correlate source code with product
- As noted above, source code examination should generally be partnered with examination of the commercial product, which examination can be dynamic (e.g., logging, network packet monitoring, file-system monitoring, or debugging) or static (e.g., file dump, disassembly, or decompilation). Chapter 6 discussed reverse engineering the product to learn product internals in advance of examining the source code; and earlier in this chapter the product was examined to confirm the presence of code; here, the emphasis is on dynamic analysis of the running product when static analysis of the source code is insufficient to prove what the code does.
- For example, dynamic analysis may be required to show that selected code is truly executed; however, be wary of overgeneralizing from dynamic analysis, which by itself can only prove that the selected code was executed in the particular configurations tested.
- Dynamic analysis may be necessary when the source code makes clear that execution of selected code is data-dependent, and the data is only known at run-time; e.g., if (this_thing_external_to_source_code == some value), then invoke this infringing code, else invoke some other possibly non-infringing code.
- Relevant code may be executed only upon occurrence of what appears to be some error, exception, or boundary condition, yet dynamic analysis may reveal that this condition occurs very frequently in the tested configurations. For example, in Lucent v. Gateway/Microsoft, a run-time error caused selection of a particular media encoder; it was necessary to know how often this error occurred (liability comes from even a single occurrence, but significant damages would depend on frequent occurrence). Note that infrequent occurrence may show the presence of non-infringing alternatives.
- A dynamic product examination is the recommended alternative to merely assuming that selected code is used to implement a visible product feature. The examiner should be able to show (with a combination of screenshots and code in the claims table) the path from some trigger in the GUI (user selects menu or dialog item, etc.) to the execution of the selected code. [Also, can something in the GUI be used to show what the PHOSITA could have gleaned from the product?]
- If I wanted to demonstrate use or non-use of the patented invention to the fact-finder, without showing them the source code itself, is there a way I could do this? Can I visually pinpoint the difference between infringement vs. non-infringement in the GUI, or can I merely show the generally relevant aspect of the product about which the dispute occurs?
- Does the source code include logging statements which, were logging enabled in the commercial product, would show “live” execution of significant points in the source code?
- Can I trace from visible GUI elements to the source code which is triggered when the GUI element is selected? Do I have everything needed to do that, including e.g. a resource file with IDs which are referenced in the source code?
- Logging files located with the source production, non-source production, or even posted to newsgroups, may reveal run-time behavior which is important to understanding the code. If dates of the postings can be authenticated, this can also establish public accessibility on a certain date. [Give example in which posted SiSoftware Sandra logs showed use of microprocessor bit.]
- There’s an important general reason for performing dynamic analysis, even when the source code is crystal clear and the examiner is positive: one wants to have multiple, independent sources of information for each asserted match or non-match. [Cite forensics literature on multiple independent sources.] This is especially important when viewing source code with a tool or technique supplied by someone other than the examiner (e.g. SciTools Understand). In addition to knowing how the tool works, what limitations it has, and what different inputs or configurations could affect its output, the examiner will also want to be able to point to some completely independent source of information for the same point.
- In some cases, the examiner should write small scripts or test programs on the source-code machine.
- Given a source-code production, independent sources of information include the publicly-available product, any publicly-available source code, marketing literature, and internal documents such as specs and emails produced in discovery. See chapter 29 on correlating source code with non-source documents.
- Multiple (but not necessarily independent) sources of information: using different tools (both off-the-shelf programs and ad-hoc tests), dynamic as well as static analysis, examining the product as well as the source, using both the top-down and bottom-up search methods described in chapter 19 on tracing, and using diagrams as well as words to inspect code.
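Where the product emits logs, the question of how often an error-triggered path is taken can be approached by counting log markers. The sketch below is illustrative only: the log lines and marker strings are invented (they are not drawn from any case discussed above), and in practice the markers would be taken from logging statements found in the source code.

```python
from collections import Counter


def marker_counts(log_lines, markers):
    """Count the log lines that contain each marker substring."""
    counts = Counter({marker: 0 for marker in markers})
    for line in log_lines:
        for marker in markers:
            if marker in line:
                counts[marker] += 1
    return counts


# Hypothetical run-time log captured while testing the product.
log = [
    "12:00:01 encode start file=a.wav",
    "12:00:01 codec init failed, using fallback encoder",
    "12:00:02 encode start file=b.wav",
    "12:00:03 encode start file=c.wav",
    "12:00:03 codec init failed, using fallback encoder",
    "12:00:04 encode start file=d.wav",
]

counts = marker_counts(log, ["encode start", "using fallback encoder"])
fallback_ratio = counts["using fallback encoder"] / counts["encode start"]
```

In this made-up log the fallback path is taken in 2 of 4 encodes, a ratio of 0.5. Any such figure describes only the configurations and inputs actually tested, so the test conditions should be recorded alongside it.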