DSTM Memory Management
Ok. I've decided against using a garbage collector since it will involve using external libraries and I'd like the DSTM implementation to be completely independant of any library used.
Anyway, two problems I've had (and since have solved) are below:-
Clearning the transactions
I have used reference counting to keep track of the transactions like I mentioned in a previous post. But there were still some issues when it came time to clear the transactions. After a lot of work, it turns out that the reference counting is working fine - BUT if two threads get to the point where the number of references to the transaction is 0 at the same time, one will free the transaction and succeed, while the other will fail.
The way around this problem was to implement a check just before you attempt to free thr transaction, and if someone else has freed it then don't try to free it. That sim ple.
I wish I could say the same for the test though....
Clearning Old Locators
Now this one is bad. Really BAD.
The problem here is this. We have two threads which attempt to open an object for writing. Both ones point to the old locator, and both create a new locator which they attempt to install. Now let's say transaction A just pointed to the old locator and was switched out, while transaction B managed to switch the old locator to the new one, and delete the old locator.
Now the problem is, when transaction A continues working, the locator it is poiting to has its memory deallocated, and this will obviously lead to a runtime error. The problem is, there is no way for inform the first transaction that it has been deallocated.
In java, what DSTM does, is that it doesn't free the memory, and in time the transaction will realize that its not valid any more and the garbage collector will take care of all the problems.
In C, we have to manually do all the garbage collection though. The idea would be in this case to implement reference counting again to keep track of how many transactions are pointing to each object. Which is all well until you consider the following scenrario:
Transaction A points to the old locator, it sees that it's there so it attempts to increment the reference count of the locator. Before it actually does that. Transaction B manages to successfully swap the old locator with the new. It attempts to clear the locator but it checks the reference count beforehand to make sure that it should. Having checked the reference count, it finds that its 0 (since A hasn't actually updated it yet). So it deallocates it.
Now Transaction A is up and running again, it attempts to increment the transaction count (which points to a random place in memory now), and hell breaks loose.
The problem here is that a compare-and-swap will not work, since you do not know what you should compare with.
After a lot of thought, I realized that the best way to handle it would be, dare I say, a LOCK! Yes a lock. At first this idea seemed repulsive, since the whole point here is to avoid locks! But as I thought about it more, I realized that if each transactional object (TMObject) implements its own lock, and we just lock incrementing the reference counter as well as decrementing/deallocating then it will actually be at the finest level of granularity possible. In other words, such a lock will only serialize what really has to be serialzed, and it will not serialize the portions that are fine if they ran together.
Anyway, that's what I did. James doubts that Jim would be happy with this solution. But in my humble opinion, this solution is better than using an external library GC. Moreover, even if SLE was used for this particular portion a lock would actually be better since SLE will try, and in case of contention will fail and fall back to a lock. While here we know for a fact that it WILL fail so we just go ahead and lock it from the start.
Rogue Frees
The remaining problem now is that it seems that somewhere I am freeing something I shouldn't free. I do not know if I'm actually up to investigating this today or tomorrow.
Moreover, I have this nagging feeling that I might have a memory leak somewhere. This might be cause by (this will not make any sense to people not familiar with the code) by line 245 in tobject.c . This is just a reminder in case I forget it.
Anyway, two problems I've had (and since have solved) are below:-
Clearning the transactions
I have used reference counting to keep track of the transactions like I mentioned in a previous post. But there were still some issues when it came time to clear the transactions. After a lot of work, it turns out that the reference counting is working fine - BUT if two threads get to the point where the number of references to the transaction is 0 at the same time, one will free the transaction and succeed, while the other will fail.
The way around this problem was to implement a check just before you attempt to free thr transaction, and if someone else has freed it then don't try to free it. That sim ple.
I wish I could say the same for the test though....
Clearning Old Locators
Now this one is bad. Really BAD.
The problem here is this. We have two threads which attempt to open an object for writing. Both ones point to the old locator, and both create a new locator which they attempt to install. Now let's say transaction A just pointed to the old locator and was switched out, while transaction B managed to switch the old locator to the new one, and delete the old locator.
Now the problem is, when transaction A continues working, the locator it is poiting to has its memory deallocated, and this will obviously lead to a runtime error. The problem is, there is no way for inform the first transaction that it has been deallocated.
In java, what DSTM does, is that it doesn't free the memory, and in time the transaction will realize that its not valid any more and the garbage collector will take care of all the problems.
In C, we have to manually do all the garbage collection though. The idea would be in this case to implement reference counting again to keep track of how many transactions are pointing to each object. Which is all well until you consider the following scenrario:
Transaction A points to the old locator, it sees that it's there so it attempts to increment the reference count of the locator. Before it actually does that. Transaction B manages to successfully swap the old locator with the new. It attempts to clear the locator but it checks the reference count beforehand to make sure that it should. Having checked the reference count, it finds that its 0 (since A hasn't actually updated it yet). So it deallocates it.
Now Transaction A is up and running again, it attempts to increment the transaction count (which points to a random place in memory now), and hell breaks loose.
The problem here is that a compare-and-swap will not work, since you do not know what you should compare with.
After a lot of thought, I realized that the best way to handle it would be, dare I say, a LOCK! Yes a lock. At first this idea seemed repulsive, since the whole point here is to avoid locks! But as I thought about it more, I realized that if each transactional object (TMObject) implements its own lock, and we just lock incrementing the reference counter as well as decrementing/deallocating then it will actually be at the finest level of granularity possible. In other words, such a lock will only serialize what really has to be serialzed, and it will not serialize the portions that are fine if they ran together.
Anyway, that's what I did. James doubts that Jim would be happy with this solution. But in my humble opinion, this solution is better than using an external library GC. Moreover, even if SLE was used for this particular portion a lock would actually be better since SLE will try, and in case of contention will fail and fall back to a lock. While here we know for a fact that it WILL fail so we just go ahead and lock it from the start.
Rogue Frees
The remaining problem now is that it seems that somewhere I am freeing something I shouldn't free. I do not know if I'm actually up to investigating this today or tomorrow.
Moreover, I have this nagging feeling that I might have a memory leak somewhere. This might be cause by (this will not make any sense to people not familiar with the code) by line 245 in tobject.c . This is just a reminder in case I forget it.
0 Comments:
Post a Comment
<< Home