We have an ongoing problem with a repository that we're bulk loading into. The imports work fine for a while (hours or even days), but at some point the repo service stops without any indication of what the error was. There's no dump, and nothing at all in the repo log or the log4j log. Even more puzzling, once it gets into this state we can't run large-scale bulk loads for more than a few minutes, or at most an hour or two, before it drops again, sometimes in quick succession. It takes a reboot of the server to get it back to a state where we can load for a long period of time, and even then it will eventually drop.
The same behavior happens with both our in-house DFC tool and OpenMigrate, and it seems to get worse the more import threads we run. For example, our in-house tool can only run for a couple of hours with 10 target threads, and fails even sooner with 20 target threads, whereas OpenMigrate can run for days with 4 target threads.
It seems we might have a configuration issue, but with nothing in the logs pointing to the underlying cause, it's very difficult to pinpoint. Has anyone run into a similar issue? I do have a ticket open with support, but this is the second time I've opened one for this issue, and the first was never resolved.