Instead of DTO
Objects use data messages to communicate with each other. It means that object methods can accept some data, but data structure sometimes is too complex. When complex data message is designed wrongly it tends to reduce maintainability because it becomes harder to test this code, and harder to read and understand it. Many people use DTOs for object messages, just because it’s easier to implement, but this code will be less readable in future and has a lot of hidden drawbacks, e.g. broken encapsulation. Instead of this, data languages should be used for complex data structures. It moves data definitions from source code and lets the code encapsulate the data and concentrate on the object’s behavior.
The problem
One of the examples of complex data format is when you have a service with multiple implementations which should receive some data to process. Let’s take an email message with a service to send as an example for this post. In this example the email message can be send in two different ways: via SMTP service or via external API. The wrong way (but quite popular) is to present an email as a DTO class and accept it in message sender interface:
class MailDTO {
String subject;
String body;
String address;
Iterable<String> cc;
Iterable<byte[]> attachments;
String signature;
}
interface MailService {
void send(MailDTO mail);
}
People usually use getters and setters for DTO instead of public fields
but I don’t see any difference comparing to “all-public-fields” DTO.
It’s not actually an object, but data-holder. This class breaks encapsulation
and is not testable. It makes it harder to write unit tests for MailService
implementation,
because you don’t actually know which DTO fields will be used internally and you always need
to construct a working copy of this class for each unit test. In
worst case scenario
this DTO comes from external library. It has private fields with public getters and it’s constructed
internally using reflection
so it’s not possible to test DTO receivers without some dirt in tests like
Mockito or reflection. It’s really hard to
maintain such tests, just look at this code to understand why:
@Test
void sendsMailViaSmtp() {
MailDTO mail = Mockito.mock(MailDTO.class);
Mockito.when(mail.getAddress()).thenReturn("qwe@asd.com");
Mockito.when(mail.getBody()).thenReturn("hello");
Mockito.when(mail.getCcList()).thenReturn(ccList);
Mockito.when(mail.getAttachments()).thenReturn(attachments);
Mockito.when(mail.getSignature()).thenReturn("test");
new SmtpService(...).send(mail);
}
Instead of focusing toward testing the logic, programmer has to read or write a dosen of mocking lines. You need to stop mocking if you want to make your tests clearer.
Another huge problem of this DTO is broken encapsulation: it’s pretty fine for procedural code to use DTO, since this paradigm requires data to be open, but this post is about OOP, where encapsulation is one of the core principles.
Solution
The correct way for this example will be:
- hide the data by encapsulation it as an object’s state
- revert communication direction (from “service is sending mail” to “mail sends itself via service”)
- and use data languages to communicate with mail services
interface Mail {
void send(MailService srv);
}
interface MailService {
void accept(XML message);
}
I’m suggesting to build custom data protocols, when complex data structures should be passed between objects. In this example I’ll use XML to pass data from mail to service, it has some advantages over previous solutions:
- validation - we can enforce the protocol with
xsd
schemas and fail methodaccept()
if xml is invalid - queries -
MailService
can use xpath queries to access the data - readability - XML has readable format, so it’s easier to view XML file instead of using debugger to inspect DTO instances
- allows you to build complex data structures - XML is a flexible language to define complex data structures
- transformations - XML data structure can be transformed using
xsl
transformations - flexibility - a mail object can construct a data message from internal state, or decorate existing object to put additional data.
The disatvantages are:
- complexity - it’s too complex for simple data structures
- knowledge - it requires additional knowledge in XML language to design such structures
It’s better to start with xsd
schema to define data structure, but
it’ll be over-complex for a simple blog post, so I skip it here.
If you are not familiar with xsd
schemas you may start learning them
here: www.w3schools.com
To pass it to MailService
we can use XML
object from
jcabi-xml:
void accept(XML data);
So now, mail services can use xpath queries to query the data:
class MailSmtp implements MailService {
@Override
public void accept(final XML mail)
throws IOException {
final String address = mail.xpath("/mail/recipient/text()").get(0);
final String subject = mail.xpath("/mail/subject/text()").get(0);
final List<String> ccs = mail.xpath("/mail/ccs/cc");
// TODO: send via SMTP
}
}
On the other hand, Mail
implementations can use
Xembly
language to build XML
object message using directives
(pay attention: this class doesn’t expose internals and doesn’t break encapsulation, it rather
constructs a message to another object using internal state):
class MailSimple implements Mail {
private final String subj;
private final String text;
private final String address;
public void post(final MailService svc) {
svc.accept(
new AsXml(
new Directives()
.add("mail")
.add("subject").set(this.subj).up()
.add("address").set(this.address).up()
.add("body")
.add("text").set(this.text).up()
)
);
}
}
The idea is simple - hide the data in Mail
object, build XML in Mail
implementations
from encapsulated state and pass it to accept()
of MailService
:
new MailSimple(
"Test mail",
"Hello",
"test@test.com"
).post(new Smpt(connection));
One of the advantages is flexibility: these classes are easy to wrap, e.g. here is a decorator to add CCs to origin mail:
class MailWithCC implements Mail {
private final Mail origin;
private final Iterable<String> ccs;
@Override
public void post(final MailService svc) {
this.origin.post(
mail -> svc.accept(
new AsXml(
new Directives(Directives.copyOf(mail.node()))
.xpath("/mail")
.addIf("ccs")
.append(
new IoCheckedScalar<>(
new Reduced<>(
new Directives(),
(dirs, cc) -> dirs.add("cc").set(cc).up(),
this.ccs
)
).value()
)
)
)
);
}
}
This class will add CC-list to existing mail by updating XML data:
new MailWithCC(
new MailSimple(
"Test mail",
"Hello",
"test@test.com"
),
"copy@test.com"
).post(new Smpt(connection));
Or service itself can be decorated in a similar way:
class WithCc implements MailService {
private final MailService origin;
void accept(XML message) {
origin.accept(
new AsXml(
new Directives(Directives.copyOf(mail.node()))
.xpath("/mail")
.addIf("ccs")
.append(
new IoCheckedScalar<>(
new Reduced<>(
new Directives(),
(dirs, cc) -> dirs.add("cc").set(cc).up(),
this.ccs
)
).value()
)
)
)
}
}
Another important advantage of this approach is that it’s easy to unit-test these classes:
@Test
void appendsCcs() throws Exception {
new MailWithCC(
new Mail.FAKE,
new ListOf<>("copy@test.com")
).post(
mail -> MatcherAssert.assertThat(
mail.node(),
XhtmlMatchers.hasXPaths("/mail/ccs/cc[./text() = 'copy@test.com']")
)
);
}
When you change your data format (and update xsd
schema), you may write an
xsl
transformation to update the old version to new one on “data side”, so you just change
the code to support only the new format and apply transformations to convert the old data.
Conclusion
To summarize it all - we’re spending more time on implementation, since we need to write all these schemas, xml manipulators, etc., but saving much more time on maintaining the code and making it more readable. But you always should think about the balance between the cost of implementing and cost of maintaining: I’d never use XML language for simple data messages, e.g. if by business requirements a mail can contain only a message and address, nothing more; it would be easier to put these properties right in the method, since creating XML definitions for that case will be too expensive for the project.