Contributed by Bryce (bnordgren). See https://issues.roundup-tracker.org/issue2550677 for the initial submission. The issue also records the results of experimenting with it in a live tracker.
Bryce's note is:
I made a generic ClassInterceptor designed to be mixed in with Class or FileClass at runtime as directed by schema.py (hence supporting all backends). The two initial uses of this class are:
- compression of "content" (for messages primarily composed of text) and
- suppression of duplicate file entries (handy for people who have "image signatures" set up in their email client).
It could also be used for encryption of file content on disk. The attached file contains the code. Use of the file in schema.py is as follows:
    from interceptor import interceptor_factory, GzipFileClass, UniqueFileClass

    # Make a message file class which compresses the content before storage.
    MessageFileClass = interceptor_factory('MessageFileClass', GzipFileClass, FileClass)

    # Make a file class which only stores unique files.
    DataFileClass = interceptor_factory('DataFileClass', UniqueFileClass, FileClass)
The file is attached to this page and can be downloaded using interceptor.py.
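The mix-in idea in the note can be sketched roughly as follows. This is not the code from interceptor.py: the names FileClass (a stand-in here, not Roundup's class), GzipInterceptor, make_intercepted_class, set_content and get_content are all hypothetical, chosen only to show how a factory can place an interceptor ahead of a base class at runtime.

```python
import gzip


class FileClass:
    """Stand-in for a storage class; keeps content in a dict (illustration only)."""

    def __init__(self):
        self._store = {}

    def set_content(self, nodeid, data):
        self._store[nodeid] = data

    def get_content(self, nodeid):
        return self._store[nodeid]


class GzipInterceptor:
    """Hypothetical mixin: compress on the way in, decompress on the way out."""

    def set_content(self, nodeid, data):
        super().set_content(nodeid, gzip.compress(data))

    def get_content(self, nodeid):
        return gzip.decompress(super().get_content(nodeid))


def make_intercepted_class(name, interceptor, base):
    """Build a new class with the interceptor ahead of the base in the MRO."""
    return type(name, (interceptor, base), {})


MessageFileClass = make_intercepted_class(
    "MessageFileClass", GzipInterceptor, FileClass)

msgs = MessageFileClass()
text = b"hello roundup " * 100
msgs.set_content("1", text)
assert msgs.get_content("1") == text        # round trip is transparent
assert len(msgs._store["1"]) < len(text)    # stored form is compressed
```

Because the interceptor only calls super(), the same mixin can in principle be combined with any base class that provides the two methods, which is presumably what lets the real factory work across backends.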
Additional notes from the issue, based on experiments:
To use this, put interceptor.py in the tracker's lib subdirectory (create it if missing).
Then add at the top of schema.py:
    from interceptor import interceptor_factory, GzipFileClass, UniqueFileClass

    # Make a message file class which compresses the content before storage.
    MessageFileClass = interceptor_factory('MessageFileClass', GzipFileClass, FileClass)

    # Make a file class which only stores unique files.
    DataFileClass = interceptor_factory('DataFileClass', UniqueFileClass, FileClass)
Then, later in schema.py, replace FileClass with MessageFileClass for msg and with DataFileClass for file:

    # FileClass automatically gets this property in addition to the Class ones:
    #   content = String()    [saved to disk in <tracker home>/db/files/]
    #   type = String()       [MIME type of the content, default 'text/plain']
    msg = MessageFileClass(db, "msg",
                    author=Link("user", do_journal='no'),
                    recipients=Multilink("user", do_journal='no'),
                    date=Date(),
                    summary=String(),
                    files=Multilink("file"),
                    messageid=String(),
                    inreplyto=String())

    file = DataFileClass(db, "file",
                    name=String())
I have not found any issues with the gzip of messages. The GUI displays msg files stored on disk both uncompressed (written before the schema change) and compressed (written after the schema change).
Using "zcat db/files/msg/0/msg3" will show the contents of the compressed message.
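Since old messages stay uncompressed on disk while new ones are gzipped, a script that reads the files directly has to cope with both. A minimal sketch, assuming the compressed files are ordinary gzip streams (as the zcat test suggests); read_message is a hypothetical helper, and how interceptor.py itself tells the two cases apart is not shown here:

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def read_message(path):
    """Return the message bytes, decompressing only if the file is gzip data."""
    with open(path, "rb") as fh:
        raw = fh.read()
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw
```

Checking the magic bytes is a heuristic: an uncompressed message that happened to start with those two bytes would be misread, which seems unlikely for text messages.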
DataFileClass is interesting but has the following UI issues:
- When attaching the same file under a different name, the new file doesn't show up. This is understandable since it is already attached, but it just looks like nothing happened. It should raise a message saying that the file is already attached to the ticket under the other name.
- Assume we have two files with the same contents: a.txt and b.txt. Attach a.txt (designated file1) to issue1 first, then attach b.txt to issue2. On issue2, a link to file1 is added, reusing the existing uploaded file so only one copy is kept on disk. However, the filename displayed on issue2 is a.txt (the name of the first file attached). Not sure how to solve this. This could allow data leakage, and a reference in issue2 to "b.txt" will be confusing.
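Both issues come from looking files up by content alone. The toy store below is a sketch of that behaviour, not the UniqueFileClass implementation: UniqueStore, its add method and the use of SHA-256 are assumptions made for illustration. It returns a duplicate flag and the originally stored name, which is the information a UI would need to show the "already attached" message suggested above.

```python
import hashlib


class UniqueStore:
    """Toy content-addressed store; hypothetical, not Roundup's API."""

    def __init__(self):
        self._by_digest = {}   # sha256 hex digest -> (file id, first name)
        self._next_id = 1

    def add(self, name, content):
        """Return (file_id, stored_name, is_duplicate)."""
        digest = hashlib.sha256(content).hexdigest()
        if digest in self._by_digest:
            file_id, stored_name = self._by_digest[digest]
            return file_id, stored_name, True
        file_id = "file%d" % self._next_id
        self._next_id += 1
        self._by_digest[digest] = (file_id, name)
        return file_id, name, False


store = UniqueStore()
first = store.add("a.txt", b"same bytes")
second = store.add("b.txt", b"same bytes")
if second[2]:
    # prints: b.txt duplicates file1 (stored as a.txt)
    print("b.txt duplicates %s (stored as %s)" % (second[0], second[1]))
```

The second call returns the name a.txt, not b.txt, which reproduces the naming problem: the name lives on the single stored file, so a per-issue display name would have to be kept somewhere else.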
I wonder if something similar could be done for those people who want to:
- put files as blobs in the database
- cryptographically sign files/messages to prevent tampering
- use mercurial as a method to replicate files to a backup system and also eliminate unobservable tampering
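As a rough idea of what a tamper-detection interceptor might compute, here is a sketch using a keyed HMAC from the Python standard library. Note that an HMAC is a shared-secret integrity check, not a public-key signature, so this only approximates "signing"; the function names sign and verify, and where the key and tag would be stored, are assumptions and not part of interceptor.py.

```python
import hashlib
import hmac


def sign(content, key):
    """Return a hex HMAC-SHA256 tag for the content under the given key."""
    return hmac.new(key, content, hashlib.sha256).hexdigest()


def verify(content, key, signature):
    """Check a tag in constant time; False means the content or tag changed."""
    return hmac.compare_digest(sign(content, key), signature)


key = b"tracker-secret"            # hypothetical key, kept outside the file store
tag = sign(b"message body", key)
assert verify(b"message body", key, tag)
assert not verify(b"message b0dy", key, tag)
```

An interceptor along these lines could store the tag as an extra property when content is written and check it when content is read.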