Challenge #15: Finding differences in datasets


You are a data center administrator. You have 100 machines carefully configured and running smoothly.

Each machine has a mac_address, an IP, a network name and an operating system installed. You keep track of all these things like so:


The Challenge

One night, a horde of drunken cats invades the data center.

They randomly turn machines off, turn decommissioned machines on, or mess with the configuration and OS of the machines.

The only things they could not tamper with were the MAC addresses.

The next day you take stock of the state of the data center to check what you need to fix to get back to normal.

For this challenge, you need to compare the original state of the data center with the sabotaged state. Begin with start.dfl (306.1 KB)


Your data flow should determine which mac_addresses need administrative action.

  • machines that were turned off need to be booted up
  • any extra machines that were turned on need to be shut down
  • if the cats changed the operating system, specify which system should be installed
  • if the cats changed the IP, specify which IP to use
  • if the cats changed the name, specify which name to use
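Outside of a data flow, the rules above can be sketched in a few lines of Python. This is only an illustration of the comparison logic, not the intended solution: the record layout (dictionaries keyed by mac_address with `ip`, `name` and `os` fields), the function name `required_actions` and the action labels are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical record shape: {"ip": ..., "name": ..., "os": ...}
Machine = Dict[str, str]


def required_actions(original: Dict[str, Machine],
                     current: Dict[str, Machine]) -> List[dict]:
    """Compare two states keyed by mac_address and list the fixes needed."""
    actions = []
    # Visit every mac_address seen in either state, in sorted order.
    for mac in sorted(original.keys() | current.keys()):
        before = original.get(mac)
        after = current.get(mac)
        if after is None:
            # Was running originally, is missing now: turned off by the cats.
            actions.append({"mac_address": mac, "action": "boot up"})
        elif before is None:
            # Not part of the original state: an extra machine was turned on.
            actions.append({"mac_address": mac, "action": "shut down"})
        else:
            # Present in both states: restore any field that was changed.
            for field in ("os", "ip", "name"):
                if before[field] != after[field]:
                    actions.append({
                        "mac_address": mac,
                        "action": f"set {field}",
                        "value": before[field],  # the original value to restore
                    })
    return actions
```

Because the MAC addresses are the one thing that cannot change, they are the natural key for matching machines between the two states.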

Your flow should produce a data set like this:

Hint: The Diff on sorted keys step is helpful in determining any changes between data sets.
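To give a feel for the general idea behind diffing on sorted keys, here is a minimal Python sketch of a merge-style comparison of two key-sorted lists. It is not the implementation of the step itself; the function name `diff_sorted`, the `(key, value)` row shape and the labels `identical`, `changed`, `deleted` and `new` are assumptions chosen for illustration.

```python
from typing import Iterator, List, Tuple

# Hypothetical row shape: (key, value), e.g. (mac_address, os)
Row = Tuple[str, str]


def diff_sorted(reference: List[Row],
                compare: List[Row]) -> Iterator[Tuple[str, str]]:
    """Walk two lists sorted by key in lockstep and label each key."""
    i = j = 0
    while i < len(reference) and j < len(compare):
        ref_key, ref_val = reference[i]
        cmp_key, cmp_val = compare[j]
        if ref_key == cmp_key:
            # Key exists on both sides: compare the values.
            label = "identical" if ref_val == cmp_val else "changed"
            yield ref_key, label
            i += 1
            j += 1
        elif ref_key < cmp_key:
            # Key only in the reference list: it disappeared.
            yield ref_key, "deleted"
            i += 1
        else:
            # Key only in the compare list: it is new.
            yield cmp_key, "new"
            j += 1
    # Whatever is left on either side has no counterpart.
    for key, _ in reference[i:]:
        yield key, "deleted"
    for key, _ in compare[j:]:
        yield key, "new"
```

Both inputs must already be sorted by key, which is why each side needs to be sorted on mac_address before the comparison.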