What if WSDL lies?

Posted by Wojtek Dziegielewski on May 28, 2021

Imagine a SOAP/WSDL service that has been working reliably for years. You have a C# client application that consumes this service using a WCF Service Reference. It all works well until one day your application suddenly crashes with an inexplicable error, such as a NullReferenceException. After long troubleshooting, you notice a problem with the service response: the response proxy class does not contain the expected data. What happened?

Is the service telling the truth?

A SOAP service is generally accompanied by a service definition, i.e. a WSDL document. This definition is used by the service consumer to prepare the service requests and interpret the responses. This is exactly what the WCF proxy classes (auto-generated upon setting the Service Reference) are designed to do. However, there is nothing that forces the service to act according to the service definition. The service vendor may change the service behavior without updating the WSDL document. Or, the Service Reference in the consumer application does not get updated to the new definition. Or, perhaps a non-compliant response occurs only in rare edge cases that never made it into the WSDL. Or…

The bottom line is that the service may not respond exactly as prescribed in its WSDL document. If this happens, your consumer application is broken.

WCF proxy classes and data contracts

  • What does the code generated by WCF (upon setting a Service Reference) look like?

  • How does the generated code reflect the service definition (WSDL)?

  • What will happen if the service returns data that is not defined in the service definition? Will such data be somehow reflected in the generated proxy classes?

Note that the Service Reference is specific to .NET Framework. In .NET Core, there is a Connected Service in its place. The resulting auto-generated code is essentially the same.

The last question is of most interest to us. After all, in our scenario above, the service response is inconsistent with its WSDL, causing our consumer application to crash. Wouldn’t it be nice to see all data returned by the service without resorting to parsing the raw HTTP response?

WCF-generated proxy classes account for all data elements, regardless of whether they are defined in the WSDL. This is because serialization and deserialization are done by default with DataContractSerializer, and the generated classes that represent service requests and responses implement IExtensibleDataObject, which gives them an ExtensionData property (of the ExtensionDataObject type).
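For illustration, a generated data contract class might look roughly like the sketch below. The Book type, its namespace and its members are hypothetical; real generated code carries additional attributes and plumbing. The part that matters here is the IExtensibleDataObject implementation.

```csharp
using System.Runtime.Serialization;

// Hypothetical, simplified stand-in for a WCF-generated data contract class.
[DataContract(Name = "Book", Namespace = "http://example.com/library")]
public class Book : IExtensibleDataObject
{
    // Receives any elements of the response that have no matching [DataMember].
    public ExtensionDataObject ExtensionData { get; set; }

    // Elements defined in the WSDL become discrete properties.
    [DataMember]
    public string Title { get; set; }

    [DataMember]
    public int Rating { get; set; }
}
```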

The ExtensionData property contains a set of name-value pairs that contain those elements that are not defined in the WSDL (of course those elements defined in WSDL are deserialized as discrete properties of the proxy class). The contents of the ExtensionData property can be viewed using the Visual Studio debugger.

A side question: what is the rationale behind adding this extra data? The answer: to facilitate forward compatibility and avoid data loss during round-tripping. Take for example a service that has two methods, GetBook and SetBook. The GetBook response matches the SetBook request; both are of a Book type. A service consumer (a library application) uses these methods to update the book rating data. Say version 2 of the service adds a new Book property called PageCount. Our library application, however, still uses version 1 and is therefore unaware of this new property. The application can still safely call the GetBook/SetBook methods to update the book ratings. The PageCount data will not be lost; it will be preserved during the round-trip in the ExtensionData property.
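The round-trip can be tried without any service at all by using DataContractSerializer directly. The following is a minimal sketch: BookV2 and BookV1 are hypothetical stand-ins for the service-side and consumer-side views of Book, and the namespace and sample values are made up for the demonstration.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Text;

// Version 2 of the contract, as the service sees it (hypothetical).
[DataContract(Name = "Book", Namespace = "http://example.com/library")]
public class BookV2
{
    [DataMember(Order = 1)] public string Title { get; set; }
    [DataMember(Order = 2)] public int Rating { get; set; }
    [DataMember(Order = 3)] public int PageCount { get; set; }
}

// Version 1 of the contract, as the outdated consumer sees it (hypothetical).
[DataContract(Name = "Book", Namespace = "http://example.com/library")]
public class BookV1 : IExtensibleDataObject
{
    [DataMember(Order = 1)] public string Title { get; set; }
    [DataMember(Order = 2)] public int Rating { get; set; }

    // Unknown elements such as PageCount are kept here.
    public ExtensionDataObject ExtensionData { get; set; }
}

public static class RoundTripDemo
{
    public static string Serialize<T>(T value)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, value);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public static T Deserialize<T>(string xml)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        {
            return (T)serializer.ReadObject(stream);
        }
    }

    // Simulates GetBook (version 2 data) followed by SetBook from a version 1 consumer.
    public static string RoundTrip()
    {
        string fromService = Serialize(new BookV2 { Title = "Dune", Rating = 3, PageCount = 412 });
        BookV1 book = Deserialize<BookV1>(fromService);
        book.Rating = 5; // the consumer only knows about Title and Rating
        return Serialize(book);
    }
}
```

The XML returned by RoundTripDemo.RoundTrip() should carry the updated Rating and still include the PageCount element, even though BookV1 has no such property.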

Extracting data from ExtensionDataObject

Viewing data in the Visual Studio debugger is one thing. But how can the data held in the ExtensionData property be accessed programmatically? As Microsoft states in the article on Forward Compatible Data Contracts:

“The ExtensionDataObject type contains no public methods or properties. Thus, it is impossible to get direct access to the data stored inside the ExtensionData property.”

Here is an example of how to accomplish the “impossible”. The code below is based on the sample code from the article describing the ExtensionDataObject and works in .NET Core 3.1 as well as .NET 5.0. It should also work on the actual service responses. This sample code uses reflection and is quite brittle. Note that not only are the members of the ExtensionDataObject class non-public, but so are the types of these members. There is no guarantee that future .NET versions will have the same internal implementations.

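using System;
using System.Collections;
using System.Reflection;
using System.Runtime.Serialization;
    ...
    response = //... consume the service here ...
    ...
    // "_members", "_name" and "_value" are non-public field names and may change between .NET versions.
    var members = (IList)GetPrivateField(response.ExtensionData, "_members");
    foreach (var member in members)
    {
        var name = (string)GetPrivateField(member, "_name");
        // The member's "_value" is a data node that wraps the actual value in its own "_value" field.
        var val = (string)GetPrivateField(GetPrivateField(member, "_value"), "_value");
        Console.WriteLine($"{name}: {val}");
    }

    // Reads a non-public instance field by name using reflection.
    private static object GetPrivateField(object subject, string fieldName)
    {
        return subject.GetType().GetField(fieldName, BindingFlags.Instance | BindingFlags.NonPublic).GetValue(subject);
    }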
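Because the field lookup throws a NullReferenceException as soon as a field is missing, a slightly more tolerant helper can make the brittleness easier to live with. The sketch below is an assumption-laden variation, not part of the original sample: TryGetPrivateField is a hypothetical name, it accepts several candidate field names, and it also searches base types, since GetField does not return private fields declared on a base class.

```csharp
using System;
using System.Reflection;

public static class ReflectionProbe
{
    // Returns the value of the first non-public instance field found under any of
    // the candidate names, or null if the subject is null or no field matches.
    public static object TryGetPrivateField(object subject, params string[] candidateNames)
    {
        if (subject == null) return null;

        foreach (string name in candidateNames)
        {
            // Walk up the inheritance chain: private fields of base classes are
            // only visible when queried on the type that declares them.
            for (Type type = subject.GetType(); type != null; type = type.BaseType)
            {
                FieldInfo field = type.GetField(
                    name,
                    BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.DeclaredOnly);
                if (field != null) return field.GetValue(subject);
            }
        }
        return null;
    }
}
```

A call such as ReflectionProbe.TryGetPrivateField(response.ExtensionData, "_members") then yields null instead of crashing when the internal layout differs, which leaves the decision about how to react to the caller.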

Conclusion

When WCF is used to consume SOAP services, the ExtensionData property comes in handy in cases where the consumer code becomes outdated due to a service definition mismatch. The property prevents data loss during round-tripping by holding those data elements that the consumer is unaware of. However, the contents of the ExtensionDataObject class are not easily accessible. These common-sense alternatives are generally preferred:

  • Obtain the current version of the service definition (WSDL), use it to set the Service Reference and rebuild your service consumer application.

  • If the correct version of the service definition is not available, consider forgoing the dependency on WCF. One alternative might be to use an HttpClient to consume the service and manually parse the response received.
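The HttpClient alternative could be sketched as follows. This is a minimal illustration that assumes a SOAP 1.1 service; RawSoapClient and its method names are hypothetical, and the endpoint URL, SOAPAction value and request envelope must come from the actual service.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using System.Xml.Linq;

public static class RawSoapClient
{
    // SOAP 1.1 envelope namespace (assumption: the service speaks SOAP 1.1).
    private static readonly XNamespace Soap = "http://schemas.xmlsoap.org/soap/envelope/";

    // Posts a prepared SOAP envelope and returns the raw response body.
    public static async Task<string> PostAsync(
        HttpClient http, string url, string soapAction, string envelopeXml)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Post, url))
        {
            request.Headers.Add("SOAPAction", soapAction);
            request.Content = new StringContent(envelopeXml, Encoding.UTF8, "text/xml");
            using (HttpResponseMessage response = await http.SendAsync(request))
            {
                return await response.Content.ReadAsStringAsync();
            }
        }
    }

    // Lists every leaf element found under the SOAP Body as a (local name, value)
    // pair, whether or not the WSDL mentions it.
    public static List<KeyValuePair<string, string>> ReadLeafElements(string responseXml)
    {
        XElement body = XDocument.Parse(responseXml).Root?.Element(Soap + "Body");
        if (body == null) return new List<KeyValuePair<string, string>>();

        return body.Descendants()
                   .Where(element => !element.HasElements)
                   .Select(element => new KeyValuePair<string, string>(
                       element.Name.LocalName, element.Value))
                   .ToList();
    }
}
```

Parsing by hand trades the convenience of typed proxy classes for full visibility of the response, so it is best suited to diagnosing a mismatch or to services whose definition cannot be trusted.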

But, if you feel adventurous, you can use WCF to consume a SOAP service and still retrieve all data returned, even if not defined in the service definition.

