Memory allocation and Java crashes #42

saisirandas · 2021-04-05T19:37:41Z

My apologies if this isn't the correct place to ask this question. I couldn't find a user forum for TM2. Is there one?

I have a question about frequent "Java stopped working" crashes on TM2 model runs. They seem to occur during the second or third iteration of CTRAMP. Below are details of the model run and computing environment:

Select county 3 (Santa Clara)
Sample rate 1.0 for all TAZs
Windows Server 2016 Virtual Machine
Intel Xeon Gold 6134 CPU @ 3.20 GHz with 32 (virtual) cores
512 GB RAM

Below is the event log message at the time of crash:

02-Apr-2021 09:29:34:640, ERROR, Exception exception making RMI method call: //10.1.0.80:1191/com.pb.mtctm2.abm.ctramp.MatrixDataServer.writeMatrixFile().
java.rmi.UnmarshalException: Error unmarshaling return header; nested exception is:
java.net.SocketException: Connection reset
at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:254)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:163)
at gnu.cajo.invoke.Remote_Stub.invoke(Unknown Source)
at gnu.cajo.invoke.Remote.invoke(Unknown Source)
at com.pb.mtctm2.abm.ctramp.UtilRmi.method(UtilRmi.java:123)
at com.pb.mtctm2.abm.ctramp.MatrixDataServerRmi.writeMatrixFile(MatrixDataServerRmi.java:41)
at com.pb.mtctm2.abm.application.MTCTM2TripTables.writeMatricesToFile(MTCTM2TripTables.java:514)
at com.pb.mtctm2.abm.application.MTCTM2TripTables.writeTrips(MTCTM2TripTables.java:495)
at com.pb.mtctm2.abm.application.MTCTM2TripTables.createTripTables(MTCTM2TripTables.java:275)
at com.pb.mtctm2.abm.application.MTCTM2TripTables.main(MTCTM2TripTables.java:687)
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.DataInputStream.readByte(DataInputStream.java:265)
at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:240)
... 9 more

Is this some sort of memory/resource allocation issue with Java? The CTRAMP parameters are configured as follows. I've listed the ones that I thought were relevant.

runDriver.cmd
java -server -Xmx256m -cp "%CLASSPATH%" -Dlog4j.configuration=log4j-driver.properties -Djppf.config=jppf-driver.properties org.jppf.server.DriverLauncher

runMTCTM2ABM.cmd
java -server -Xmx130g -cp "%CLASSPATH%" -Dlog4j.configuration=log4j.xml -Dproject.folder=%PROJECT_DIRECTORY% -Djppf.config=jppf-clientLocal.properties com.pb.mtctm2.abm.application.MTCTM2TourBasedModel mtctm2 -iteration %iteration% -sampleRate %sampleRate% -sampleSeed 0

java -Xmx480g -cp "%CLASSPATH%" -Dproject.folder=%PROJECT_DIRECTORY% com.pb.mtctm2.abm.application.MTCTM2TripTables mtctm2 -iteration %iteration% -sampleRate %sampleRate%

runMtxMgr.cmd
START "Matrix Manager" %JAVA_PATH%\bin\java -Dname=p%HOST_MATRIX_PORT% -Xmx480g -cp "%CLASSPATH%" -Dlog4j.configuration=log4j_mtx.xml com.pb.mtctm2.abm.ctramp.MatrixDataServer -hostname %HOST_IP_ADDRESS% -port %HOST_MATRIX_PORT% -label "MTCTM2 Matrix Server"

runHhMgr.cmd
START "Household Manager" %JAVA_PATH%\bin\java -server -Xmx32g -cp "%CLASSPATH%" -Dlog4j.configuration=log4j_hh.xml com.pb.mtctm2.abm.application.SandagHouseholdDataManager2 -hostname %HOST_IP_ADDRESS% -port %HOST_PORT%

jppf-clientLocal.properties
jppf.local.execution.threads = 26

mtctm2.properties
distributed.task.packet.size = 500
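
As a rough sanity check on the settings above, the configured heap ceilings can be added up and compared with the 512 GB of RAM on the VM. The sketch below is only illustrative: the helper names (`xmx_to_bytes`, `total_gib`) are hypothetical, the 512 GB figure is treated as 512 GiB, and the assumption that MTCTM2TripTables and the Matrix Manager are alive at the same time is inferred from the stack trace, where MTCTM2TripTables calls `MatrixDataServer.writeMatrixFile()` over RMI. Note that `-Xmx` is an upper limit rather than a reservation, so the total is a worst case, not actual usage.

```python
import re

# Heap ceilings copied from the launch commands listed above.
XMX_FLAGS = {
    "JPPF driver (runDriver.cmd)": "-Xmx256m",
    "MTCTM2TourBasedModel (runMTCTM2ABM.cmd)": "-Xmx130g",
    "MTCTM2TripTables (runMTCTM2ABM.cmd)": "-Xmx480g",
    "Matrix Manager (runMtxMgr.cmd)": "-Xmx480g",
    "Household Manager (runHhMgr.cmd)": "-Xmx32g",
}

# Binary multipliers for the k/m/g suffixes accepted by -Xmx.
UNIT_BYTES = {"k": 1024, "m": 1024**2, "g": 1024**3}

PHYSICAL_RAM_GIB = 512  # assumption: the VM's "512 GB" means 512 GiB


def xmx_to_bytes(flag):
    """Convert a flag such as '-Xmx480g' to a byte count (hypothetical helper)."""
    match = re.fullmatch(r"-Xmx(\d+)([kmg]?)", flag.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a recognised -Xmx flag: {flag!r}")
    value, unit = match.groups()
    # A missing suffix means the value is already in bytes.
    return int(value) * UNIT_BYTES.get(unit.lower(), 1)


def total_gib(process_names):
    """Sum the heap ceilings of the named processes, in GiB."""
    return sum(xmx_to_bytes(XMX_FLAGS[name]) for name in process_names) / 1024**3


# Assumption: these two JVMs overlap while trip matrices are being written.
matrix_write_step = [
    "MTCTM2TripTables (runMTCTM2ABM.cmd)",
    "Matrix Manager (runMtxMgr.cmd)",
]

ceiling = total_gib(matrix_write_step)
print(f"Combined heap ceiling: {ceiling:.0f} GiB of {PHYSICAL_RAM_GIB} GiB RAM")
if ceiling > PHYSICAL_RAM_GIB:
    print("The two heaps together are allowed to grow past physical memory.")
```

If the combined ceiling exceeds physical memory, one thing to try is lowering the two 480g values so that they fit within RAM together, leaving headroom for the operating system and the other JVMs.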


lmz · 2021-04-12T19:19:08Z

Hi @saisirandas - There isn't a user forum for TM2, no, so this is a good place to post this. I'm not familiar with this error myself, and I haven't been running TM2 recently. MTC staff will start getting back into TM2 development shortly, though, so we could try to take a look.

Would you mind giving more context about what you're trying to do? Who is your client for this project?

saisirandas · 2021-04-25T20:45:39Z

Hi @lmz! Good to hear there is a place to ask questions.
For additional context, we are trying to run a version of the model for a private client in the South Bay. We are pivoting off the previously developed TM2 for Marin County (TAMDM). In this current run, the only changes we've made from that version are to the select county (from Marin to Santa Clara) and the sampling rate.

I'm including a screenshot of the ctramp_output folder to indicate where the error occurs. It seems to occur when matrices start being written. In this case the error occurs in iteration 2.

After some additional testing and trying different combinations of memory allocations (Xmx) and number of threads/cores in jppf-clientLocal.properties, in my best attempt yet, the run gets past iteration 2 and crashes in iteration 3 (around the same time when matrices start being written).

Let me know if you have any advice for us or if there is any additional information I can provide. In the meantime, I'll look into whether we can add more computing resources to the virtual machine to see if that helps.

Thanks!
