FTP Sync Process
Context and Problem Statement
File transfer is one of the simplest ways to share information between two systems. Because of this simplicity, several of the providers (health plans) we integrate with choose this solution to share information with us.
Problems
While sharing information using files is a simple solution to a complex problem, automating the transfer of such files to the system that needs them is not so simple.
Some of the problems we have identified during previous integrations are:
- the owner of the FTP server can delete files at any time
- it is hard to detect when the content of a file has changed and needs to be downloaded again
- it is hard to keep track of every file downloaded and uploaded
- retry logic is needed for when the FTP server is unavailable or the connection is lost
- providers rarely offer a test environment
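Two of these problems, change detection and retries, can be illustrated with a minimal sketch. It assumes that a file's size and modification time are available from the FTP listing and that comparing them is a good enough signal of change; a file rewritten with the same size and timestamp would be missed. `RemoteFile`, `needs_download?` and `with_retries` are hypothetical names used only for illustration.

```ruby
# Hypothetical value object: the metadata we assume an FTP listing gives us.
RemoteFile = Struct.new(:path, :size, :mtime)

# Returns true when a remote file has never been seen, or when its size or
# modification time differs from what was recorded at the last download.
# `seen` is a Hash of path => [size, mtime]; a real process would persist it.
def needs_download?(seen, file)
  seen[file.path] != [file.size, file.mtime]
end

# Runs the block, retrying on the listed errors with exponential backoff.
# Re-raises the last error once `attempts` tries have been made.
# `sleeper` is injectable so the wait can be skipped in tests.
def with_retries(attempts: 3, base_delay: 1.0, on: [StandardError],
                 sleeper: ->(seconds) { sleep(seconds) })
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *on
    raise if tries >= attempts
    sleeper.call(base_delay * (2**(tries - 1)))
    retry
  end
end
```

A caller would wrap each listing or download in `with_retries` and only fetch the files for which `needs_download?` returns true.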
Use cases
Download files
We receive two types of files: those that are processed by Arc (e.g. rosters), and those that contain information needed by the Care Team (e.g. reports) but are not part of any automated process. Files that Arc doesn't process must be made available to the Care Team for review. Today, someone from the Care Team has to connect to each FTP server daily to check for new reports and download them. This is a tedious and error-prone process.
Upload files
At the time of writing, the only case where we upload files to an FTP server is when we send reports to the provider.
Proposed solution
This solution aims to centralize the sending and receiving of files via FTP in one place and provide a simple interface for developers to integrate new providers without worrying about infrastructure and manual testing. In addition, it lets us keep a record of all files sent and received, and make them available to the Care Team for review.
Use cloud storage to centralize all the files that need to be imported into Arc or exported to an external FTP server. A recurring process will be responsible for copying files between the FTP servers and this storage.
Arc's business processes are responsible for:
- storing files in the appropriate cloud storage
- processing incoming files that were synced from the FTP servers
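The division of work described above can be sketched as follows. This is a simplified illustration, not the actual design: `MemoryStore` is a hypothetical in-memory stand-in for both an FTP server and the cloud storage, the `incoming/` and `outgoing/` prefixes are assumed, and the sketch compares full file contents where a real process would more likely compare metadata.

```ruby
# Hypothetical in-memory stand-in for both an FTP server and a cloud bucket.
# Keys are paths and values are file contents.
class MemoryStore
  def initialize(files = {})
    @files = files.dup
  end

  def list
    @files.keys.sort
  end

  def read(path)
    @files.fetch(path)
  end

  def write(path, content)
    @files[path] = content
  end

  def exist?(path)
    @files.key?(path)
  end
end

# Copies every file under `prefix` from `source` to `target` unless the target
# already holds identical content. Returns the copied paths, which can serve
# as the record of what was transferred in this run.
def sync(source:, target:, prefix:)
  source.list.select { |path| path.start_with?(prefix) }.each_with_object([]) do |path, copied|
    content = source.read(path)
    next if target.exist?(path) && target.read(path) == content

    target.write(path, content)
    copied << path
  end
end

# Usage: one recurring run covers both directions.
ftp = MemoryStore.new("incoming/roster.csv" => "id,name")
bucket = MemoryStore.new("outgoing/report.pdf" => "report body")
downloaded = sync(source: ftp, target: bucket, prefix: "incoming/")
uploaded = sync(source: bucket, target: ftp, prefix: "outgoing/")
```

With this split, Arc's business processes only read from and write to the storage, and never talk to an FTP server directly.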
Tool used to copy/sync files
The options analyzed were:
- rclone(-like) command called from Ruby with a wrapper
  - Alternative: host an rclone server and send commands to it
- Custom-built code that uses FTP/Files.com/AWS gems
We decided that, in the long run, it's better to have a custom solution that fits our needs exactly and does not require workarounds to implement new features. In addition, integrating a third-party CLI program would require additional infrastructure and monitoring work that we prefer not to invest in.
Making files available to Care Team
The first iteration will make files available to the Care Team via Slack. The Sync process will post a new message every time it finds external files that need to be copied to S3.
Further iterations may add a new UI app where the Care Team can view the files.
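As a rough sketch of the Slack notification, the message could be sent through a Slack incoming webhook, which accepts a JSON body with a `text` field. Using a webhook is an assumption here, as are the method names `slack_payload` and `notify_care_team` and the wording of the message; the webhook URL is supplied by the caller.

```ruby
require "json"
require "net/http"
require "uri"

# Builds the JSON body for a Slack incoming webhook.
# The message wording is illustrative only.
def slack_payload(provider:, paths:)
  lines = paths.map { |path| "- #{path}" }
  text = "#{paths.size} new file(s) from #{provider} copied to S3:\n#{lines.join("\n")}"
  JSON.generate(text: text)
end

# Posts the notification to the given webhook URL.
# Returns nil without making a request when there is nothing to report.
def notify_care_team(webhook_url, provider:, paths:)
  return nil if paths.empty?

  Net::HTTP.post(
    URI(webhook_url),
    slack_payload(provider: provider, paths: paths),
    { "Content-Type" => "application/json" }
  )
end
```

Keeping the payload builder separate from the HTTP call means the message format can be tested without reaching Slack.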
Technical details
See sync/ftp.md.