{"id":9,"date":"2007-03-18T15:43:47","date_gmt":"2007-03-18T20:43:47","guid":{"rendered":"http:\/\/wp.javatechniques.com\/blog\/low-memory-deep-copy-technique-for-java-objects\/"},"modified":"2007-06-25T13:01:36","modified_gmt":"2007-06-25T18:01:36","slug":"low-memory-deep-copy-technique-for-java-objects","status":"publish","type":"page","link":"http:\/\/javatechniques.com\/blog\/low-memory-deep-copy-technique-for-java-objects\/","title":{"rendered":"Low-Memory Deep Copy Technique for Java Objects"},"content":{"rendered":"<p>&#8220;<A HREF=\"http:\/\/javatechniques.com\/blog\/faster-deep-copies-of-java-objects\/\">Faster Deep Copies of Java Objects<\/A>&#8221; explains the concept of &#8220;deep copies&#8221; of Java objects and illustrates a common approach for copying objects using Java Object Serialization. A downside of the illustrated approach is that it requires creation of a byte array containing the entire serialized representation of the object being copied. Creation of this array is inherently wasteful, as it is thrown away once the copy has been made. <\/p>\n<p>It would, of course, be possible to save the array and re-use it for subsequent object copies, though that may not be appropriate in some applications. An alternate approach is to use <CODE>java.io.PipedInputStream<\/CODE> and <CODE>java.io.PipedOutputStream<\/CODE> to serialize the object in one thread and simultaneously deserialize it in another thread. This technique is more complex, but allows the two halves of the operation to coordinate using a small buffer rather than a large byte array. Figure 1 shows an example copy utility that uses this approach.<\/p>\n<p><HR><\/p>\n<pre>\r\n\r\nimport java.io.IOException;\r\nimport java.io.PipedOutputStream;\r\nimport java.io.PipedInputStream;\r\nimport java.io.ObjectOutputStream;\r\nimport java.io.ObjectInputStream;\r\n\r\n\/**\r\n * Utility for making deep copies (vs. clone()'s shallow copies) of objects\r\n * in a memory efficient way. Objects are serialized in the calling thread and\r\n * de-serialized in another thread.\r\n *\r\n * Error checking is fairly minimal in this implementation. If an object is\r\n * encountered that cannot be serialized (or that references an object\r\n * that cannot be serialized) an error is printed to System.err and\r\n * null is returned. Depending on your specific application, it might\r\n * make more sense to have copy(...) re-throw the exception.\r\n *\/\r\npublic class PipedDeepCopy {\r\n    \/**\r\n     * Flag object used internally to indicate that deserialization failed.\r\n     *\/\r\n    private static final Object ERROR = new Object();\r\n\r\n    \/**\r\n     * Returns a copy of the object, or null if the object cannot\r\n     * be serialized.\r\n     *\/\r\n    public static Object copy(Object orig) {\r\n        Object obj = null;\r\n        try {\r\n            \/\/ Make a connected pair of piped streams\r\n            PipedInputStream in = new PipedInputStream();\r\n            PipedOutputStream pos = new PipedOutputStream(in);\r\n\r\n            \/\/ Make a deserializer thread (see inner class below)\r\n            Deserializer des = new Deserializer(in);\r\n\r\n            \/\/ Write the object to the pipe\r\n            ObjectOutputStream out = new ObjectOutputStream(pos);\r\n            out.writeObject(orig);\r\n\r\n            \/\/ Wait for the object to be deserialized\r\n            obj = des.getDeserializedObject();\r\n\r\n            \/\/ See if something went wrong\r\n            if (obj == ERROR)\r\n                obj = null;\r\n        }\r\n        catch(IOException ioe) {\r\n            ioe.printStackTrace();\r\n        }\r\n\r\n        return obj;\r\n    }\r\n\r\n    \/**\r\n     * Thread subclass that handles deserializing from a PipedInputStream.\r\n     *\/\r\n    private static class Deserializer extends Thread {\r\n        \/**\r\n         * Object that we are deserializing\r\n         *\/\r\n        private Object obj = null;\r\n\r\n        \/**\r\n         * Lock that we block on while deserialization is happening\r\n         *\/\r\n        private Object lock = null;\r\n\r\n        \/**\r\n         * InputStream that the object is deserialized from.\r\n         *\/\r\n        private PipedInputStream in = null;\r\n\r\n        public Deserializer(PipedInputStream pin) throws IOException {\r\n            lock = new Object();\r\n            this.in = pin;\r\n            start();\r\n        }\r\n\r\n        public void run() {\r\n            Object o = null;\r\n            try {\r\n                ObjectInputStream oin = new ObjectInputStream(in);\r\n                o = oin.readObject();\r\n            }\r\n            catch(IOException e) {\r\n                \/\/ This should never happen. If it does we make sure\r\n                \/\/ that a the object is set to a flag that indicates\r\n                \/\/ deserialization was not possible.\r\n                e.printStackTrace();\r\n            }\r\n            catch(ClassNotFoundException cnfe) {\r\n                \/\/ Same here...\r\n                cnfe.printStackTrace();\r\n            }\r\n\r\n            synchronized(lock) {\r\n                if (o == null)\r\n                    obj = ERROR;\r\n                else\r\n                    obj = o;\r\n                lock.notifyAll();\r\n            }\r\n        }\r\n\r\n        \/**\r\n         * Returns the deserialized object. This method will block until\r\n         * the object is actually available.\r\n         *\/\r\n        public Object getDeserializedObject() {\r\n            \/\/ Wait for the object to show up\r\n            try {\r\n                synchronized(lock) {\r\n                    while (obj == null) {\r\n                        lock.wait();\r\n                    }\r\n                }\r\n            }\r\n            catch(InterruptedException ie) {\r\n                \/\/ If we are interrupted we just return null\r\n            }\r\n            return obj;\r\n        }\r\n    }\r\n\r\n}\r\n<\/pre>\n<p><HR><br \/>\n<CENTER>Figure 1. Deep copy utility using <CODE>PipedInputStream<\/CODE> and <CODE>PipedOutputStream<\/CODE><\/CENTER><\/p>\n<p>The thread that calls <CODE>copy(&#8230;)<\/CODE> creates two connected piped streams and a separate thread to handle deserialization. The object is written to one side of the pipe and read (in the <CODE>Deserializer<\/CODE> thread) from the other. Given the non-deterministic nature of thread scheduling, there is no guarantee that the <CODE>Deserializer<\/CODE> thread will finish reading immediately after the calling thread has finished writing, so the calling thread will block if necessary until reading has finished.<\/p>\n<p>Figure 2 shows an example of copying a large object using the <CODE>PipedDeepCopy<\/CODE> class shown in Figure 1.<\/p>\n<p><HR><\/p>\n<pre>\r\n\r\nimport java.util.Hashtable;\r\nimport java.util.Vector;\r\nimport java.util.Date;\r\n\r\npublic class Example1 {\r\n\r\n    public static void main(String[] args) {\r\n        \/\/ Make a reasonable large test object. Note that this doesn't\r\n        \/\/ do anything useful -- it is simply intended to be large, have\r\n        \/\/ several levels of references, and be somewhat random. We start\r\n        \/\/ with a hashtable and add vectors to it, where each element in\r\n        \/\/ the vector is a Date object (initialized to the current time),\r\n        \/\/ a semi-random string, and a (circular) reference back to the\r\n        \/\/ object itself. In this case the resulting object produces\r\n        \/\/ a serialized representation that is approximate 700K.\r\n        Hashtable obj = new Hashtable();\r\n        for (int i = 0; i &lt; 100; i++) {\r\n            Vector v = new Vector();\r\n            for (int j = 0; j &lt; 100; j++) {\r\n                v.addElement(new Object[] { \r\n                    new Date(), \r\n                    \"A random number: \" + Math.random(),\r\n                    obj\r\n                 });\r\n            }\r\n            obj.put(new Integer(i), v);\r\n        } \r\n\r\n        int iterations = 10;\r\n\r\n        \/\/ Run piped version\r\n        long time = 0L;\r\n        for (int i = 0; i &lt; iterations; i++) {\r\n            long start = System.currentTimeMillis();\r\n            Object copy = PipedDeepCopy.copy(obj);\r\n            time += (System.currentTimeMillis() - start);\r\n\r\n            \/\/ Avoid having GC run while we are timing...\r\n            copy = null;\r\n            System.gc();\r\n        }\r\n\r\n        System.out.println(\"Piped time: \" + time);\r\n    }\r\n\r\n}\r\n<\/pre>\n<p><HR><br \/>\n<CENTER>Figure 2. Example of copying a large object<\/CENTER><\/p>\n<p>The good news is that the extra memory needed to make the copy is essentially bounded by the size of the buffer used by <CODE>PipedInputStream<\/CODE>. By default, this buffer is 1024 bytes. The bad news is that this approach is more than twice as slow as even the unoptimized variant of the deep copy utility illustrated in &#8220;<A HREF=\"http:\/\/javatechniques.com\/public\/java\/docs\/basics\/faster-deep-copy.html\" onMouseOver=\"window.status='Last modified 7\/15\/06 11:08 PM by isenhour (http:\/\/javatechniques.com\/public\/java\/docs\/basics\/faster-deep-copy.html)'; return true;\">Faster Deep Copies of Java Objects<\/A>&#8220;. The reason for this is the added overhead of coordination between the two threads. In the example in Figure 2, copying the 700K object requires that the buffer in <CODE>PipedInputStream<\/CODE> be read and overwritten 700 times. The two threads must do a fair bit of handshaking, with the <CODE>ObjectInputStream<\/CODE> waiting for the <CODE>ObjectOutputStream<\/CODE> to write needed data to the buffer, or the <CODE>ObjectOutputStream<\/CODE> waiting for the <CODE>ObjectInputStream<\/CODE> to pull data from the buffer so that more can be written.<\/p>\n<p>Increasing the size of the buffer used by <CODE>PipedInputStream<\/CODE> can help. Unfortunately, <CODE>PipedInputStream<\/CODE> does not provide a way to change the buffer size. It can, however, be easily subclassed to provide this capability. Figure 3 illustrates this.<\/p>\n<p><HR><\/p>\n<pre>\r\n\r\n    import java.io.PipedInputStream;\r\n\r\n    \/**\r\n     * PipedInputStream subclass that allows buffer size to be set to\r\n     * a value larger than the default 1024 bytes.\r\n     *\/\r\n    private static class AdjustableBufferPipedInputStream extends \r\n        PipedInputStream {\r\n\r\n        public AdjustableBufferPipedInputStream(int bufSize) {\r\n            super();\r\n            buffer = new byte[bufSize];\r\n        }\r\n    }\r\n\r\n<\/pre>\n<p><HR><br \/>\n<CENTER>Figure 3. <CODE>PipedInputStream<\/CODE> subclass that allows buffer size to be specified in the constructor.<\/CENTER><\/p>\n<p>This helps some, but the overhead of thread coordination is still a limiting factor. If, for example, the buffer size is set to a value larger than the serialized size of the object being copied (effectively eliminating the memory advantages of the piped approach), the copy still takes nearly 50% longer than the <A HREF=\"http:\/\/javatechniques.com\/blog\/faster-deep-copies-of-java-objects\/\">single-threaded approach<\/A>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;Faster Deep Copies of Java Objects&#8221; explains the concept of &#8220;deep copies&#8221; of Java objects and illustrates a common approach for copying objects using Java Object Serialization. A downside of the illustrated approach is that it requires creation of a byte array containing the entire serialized representation of the object being copied. Creation of this &hellip; <\/p>\n<p class=\"link-more\"><a href=\"http:\/\/javatechniques.com\/blog\/low-memory-deep-copy-technique-for-java-objects\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Low-Memory Deep Copy Technique for Java Objects&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"open","template":"","meta":{"footnotes":""},"class_list":["post-9","page","type-page","status-publish","hentry","entry"],"_links":{"self":[{"href":"http:\/\/javatechniques.com\/blog\/wp-json\/wp\/v2\/pages\/9","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/javatechniques.com\/blog\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"http:\/\/javatechniques.com\/blog\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"http:\/\/javatechniques.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/javatechniques.com\/blog\/wp-json\/wp\/v2\/comments?post=9"}],"version-history":[{"count":0,"href":"http:\/\/javatechniques.com\/blog\/wp-json\/wp\/v2\/pages\/9\/revisions"}],"wp:attachment":[{"href":"http:\/\/javatechniques.com\/blog\/wp-json\/wp\/v2\/media?parent=9"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}